Overview

Dataset statistics

Number of variables: 31
Number of observations: 79293
Missing cells: 556115
Missing cells (%): 22.6%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 18.8 MiB
Average record size in memory: 248.0 B
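
The percentages in the dataset statistics can be recomputed from the raw counts. The following is a minimal sketch of that arithmetic; the variable names are illustrative and are not part of the report.

```python
# Figures copied from the dataset statistics in this report.
n_variables = 31
n_observations = 79293
missing_cells = 556115
avg_record_bytes = 248.0

# Every observation contributes one cell per variable.
total_cells = n_variables * n_observations

# Share of all cells that are missing, as a percentage.
missing_pct = 100 * missing_cells / total_cells

# Approximate total memory from the average record size (1 MiB = 2**20 bytes).
total_mib = avg_record_bytes * n_observations / 2**20

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
print(f"approximate size: {total_mib:.1f} MiB")
```

Both derived figures agree with the rounded values shown in the report.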

Variable types

Categorical: 23
Numeric: 7
Boolean: 1

Warnings

Event.Id has a high cardinality: 78143 distinct values High cardinality
Accident.Number has a high cardinality: 79293 distinct values High cardinality
Event.Date has a high cardinality: 12638 distinct values High cardinality
Location has a high cardinality: 25264 distinct values High cardinality
Country has a high cardinality: 177 distinct values High cardinality
Airport.Code has a high cardinality: 9630 distinct values High cardinality
Airport.Name has a high cardinality: 22760 distinct values High cardinality
Injury.Severity has a high cardinality: 124 distinct values High cardinality
Registration.Number has a high cardinality: 68960 distinct values High cardinality
Make has a high cardinality: 7475 distinct values High cardinality
Model has a high cardinality: 11330 distinct values High cardinality
Air.Carrier has a high cardinality: 2866 distinct values High cardinality
Publication.Date has a high cardinality: 3591 distinct values High cardinality
Engine.Type is highly correlated with Aircraft.Category and 5 other fields High correlation
Aircraft.Category is highly correlated with Engine.Type and 1 other field High correlation
Total.Uninjured is highly correlated with Investigation.Type and 2 other fields High correlation
Aircraft.Damage is highly correlated with Broad.Phase.of.Flight and 3 other fields High correlation
Latitude is highly correlated with Report.Status and 2 other fields High correlation
Broad.Phase.of.Flight is highly correlated with Aircraft.Damage and 1 other field High correlation
Total.Serious.Injuries is highly correlated with Total.Fatal.Injuries and 1 other field High correlation
Report.Status is highly correlated with Latitude and 2 other fields High correlation
Total.Fatal.Injuries is highly correlated with Total.Serious.Injuries High correlation
Investigation.Type is highly correlated with Engine.Type and 4 other fields High correlation
FAR.Description is highly correlated with Engine.Type and 10 other fields High correlation
Longitude is highly correlated with Latitude and 2 other fields High correlation
Purpose.of.Flight is highly correlated with Engine.Type and 2 other fields High correlation
Total.Minor.Injuries is highly correlated with Total.Serious.Injuries High correlation
Number.of.Engines is highly correlated with Engine.Type and 3 other fields High correlation
Schedule is highly correlated with Engine.Type and 6 other fields High correlation
Engine.Type is highly correlated with Schedule High correlation
Report.Status is highly correlated with FAR.Description High correlation
Investigation.Type is highly correlated with FAR.Description and 2 other fields High correlation
FAR.Description is highly correlated with Report.Status and 2 other fields High correlation
Purpose.of.Flight is highly correlated with Schedule High correlation
Schedule is highly correlated with Engine.Type and 3 other fields High correlation
Aircraft.Damage is highly correlated with Investigation.Type High correlation
Latitude has 53542 (67.5%) missing values Missing
Longitude has 53551 (67.5%) missing values Missing
Airport.Code has 34630 (43.7%) missing values Missing
Airport.Name has 31857 (40.2%) missing values Missing
Aircraft.Damage has 2410 (3.0%) missing values Missing
Aircraft.Category has 56816 (71.7%) missing values Missing
Registration.Number has 3084 (3.9%) missing values Missing
Number.of.Engines has 4118 (5.2%) missing values Missing
Engine.Type has 3374 (4.3%) missing values Missing
FAR.Description has 56959 (71.8%) missing values Missing
Schedule has 67792 (85.5%) missing values Missing
Purpose.of.Flight has 3894 (4.9%) missing values Missing
Air.Carrier has 75375 (95.1%) missing values Missing
Total.Fatal.Injuries has 23309 (29.4%) missing values Missing
Total.Serious.Injuries has 25551 (32.2%) missing values Missing
Total.Minor.Injuries has 24460 (30.8%) missing values Missing
Total.Uninjured has 12344 (15.6%) missing values Missing
Weather.Condition has 2157 (2.7%) missing values Missing
Broad.Phase.of.Flight has 6054 (7.6%) missing values Missing
Publication.Date has 13474 (17.0%) missing values Missing
Total.Fatal.Injuries is highly skewed (γ1 = 29.51903889) Skewed
Total.Serious.Injuries is highly skewed (γ1 = 37.45361961) Skewed
Total.Minor.Injuries is highly skewed (γ1 = 66.84890696) Skewed
Event.Id is uniformly distributed Uniform
Accident.Number is uniformly distributed Uniform
Accident.Number has unique values Unique
Number.of.Engines has 1143 (1.4%) zeros Zeros
Total.Fatal.Injuries has 40092 (50.6%) zeros Zeros
Total.Serious.Injuries has 42660 (53.8%) zeros Zeros
Total.Minor.Injuries has 40064 (50.5%) zeros Zeros
Total.Uninjured has 19126 (24.1%) zeros Zeros
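
The Skewed and Zeros warnings can be illustrated with a short sketch. It computes the population skewness g1 = m3 / m2^1.5 and the share of zeros for a list of numbers. The helper names and the sample data are made up for illustration, and the report's γ1 may come from a bias-corrected estimator, so values computed this way can differ slightly from those listed.

```python
def skewness(values):
    """Population skewness g1 = m3 / m2**1.5 (no small-sample correction)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Made-up injury counts: mostly zeros with one large value, which
# produces the long right tail that the Skewed warning flags.
sample = [0, 0, 0, 0, 10]
print(skewness(sample))
print(zero_share(sample))
```

A column dominated by zeros with a few large counts, as the injury totals are, gives a large positive skewness under this definition.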

Reproduction

Analysis started: 2021-09-06 20:01:23.497037
Analysis finished: 2021-09-06 20:01:43.073386
Duration: 19.58 seconds
Software version: pandas-profiling v3.0.0
Download configuration: config.json
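
The reported duration is the difference between the two timestamps. A small sketch using only the standard library (the variable names are illustrative):

```python
from datetime import datetime

# Timestamps copied from the Reproduction section.
started = datetime.fromisoformat("2021-09-06 20:01:23.497037")
finished = datetime.fromisoformat("2021-09-06 20:01:43.073386")

# Elapsed wall-clock time in seconds.
duration_seconds = (finished - started).total_seconds()
print(f"{duration_seconds:.2f} seconds")
```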

Variables

Event.Id
Categorical

HIGH CARDINALITY
UNIFORM

Distinct: 78143
Distinct (%): 98.5%
Missing: 0
Missing (%): 0.0%
Memory size: 619.6 KiB

Value | Count
20001214X45071 | 3
20101022X34140 | 3
20100204X45658 | 3
20001212X19172 | 3
20001212X20092 | 2
Other values (78138) | 79279

Length

Max length: 15
Median length: 15
Mean length: 15
Min length: 15

Characters and Unicode

Total characters: 1189395
Distinct characters: 12
Distinct categories: 3
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
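
As a rough illustration of how such character properties can be tallied, the sketch below counts Unicode general categories with Python's `unicodedata` module. The helper name is hypothetical. The sample values carry a trailing space as an assumption: the report shows a length of 15 for every value and as many Space Separator characters as rows, which is consistent with one space per value, but not where that space sits.

```python
import unicodedata
from collections import Counter


def category_counts(values):
    """Count Unicode general categories (e.g. Nd, Lu, Zs) over all characters."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


# Two Event.Id values from the report; the trailing space is an assumption.
sample_ids = ["20001214X45071 ", "20101022X34140 "]
counts = category_counts(sample_ids)
print(counts)
```

Here `Nd` is Decimal Number, `Lu` is Uppercase Letter and `Zs` is Space Separator, the three categories reported for Event.Id.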

Unique

Unique: 76997
Unique (%): 97.1%
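
The gap between the Distinct and Unique figures is easy to misread. The sketch below assumes that Distinct counts the different values present, while Unique counts only the values that occur exactly once; this reading is consistent with 78143 distinct versus 76997 unique values for Event.Id. The function name and the sample data are illustrative.

```python
from collections import Counter


def distinct_and_unique(values):
    """Return (number of different values, number of values seen exactly once)."""
    counts = Counter(values)
    distinct = len(counts)
    unique = sum(1 for c in counts.values() if c == 1)
    return distinct, unique


# Made-up identifiers: A1 appears three times, B2 twice, C3 and D4 once each.
ids = ["A1", "A1", "A1", "B2", "B2", "C3", "D4"]
print(distinct_and_unique(ids))
```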

Sample

1st row: 20170103X43747
2nd row: 20161230X55950
3rd row: 20161229X93022
4th row: 20161227X80237
5th row: 20161226X80840

Common Values

Value | Count | Frequency (%)
20001214X45071 | 3 | < 0.1%
20101022X34140 | 3 | < 0.1%
20100204X45658 | 3 | < 0.1%
20001212X19172 | 3 | < 0.1%
20001212X20092 | 2 | < 0.1%
20001211X09748 | 2 | < 0.1%
20001213X27739 | 2 | < 0.1%
20001205X00455 | 2 | < 0.1%
20080826X01320 | 2 | < 0.1%
20001208X08229 | 2 | < 0.1%
Other values (78133) | 79269 | > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
20100204x456583
 
< 0.1%
20001212x191723
 
< 0.1%
20001214x450713
 
< 0.1%
20101022x341403
 
< 0.1%
20070518x005872
 
< 0.1%
20001211x147692
 
< 0.1%
20010821x017392
 
< 0.1%
20001214x398322
 
< 0.1%
20040714x009692
 
< 0.1%
20001213x334892
 
< 0.1%
Other values (78133)79269
> 99.9%

Most occurring characters

ValueCountFrequency (%)
0317120
26.7%
2202726
17.0%
1183765
15.5%
X79293
 
6.7%
79293
 
6.7%
365991
 
5.5%
459066
 
5.0%
542996
 
3.6%
741448
 
3.5%
840161
 
3.4%
Other values (2)77536
 
6.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number1030809
86.7%
Uppercase Letter79293
 
6.7%
Space Separator79293
 
6.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0317120
30.8%
2202726
19.7%
1183765
17.8%
365991
 
6.4%
459066
 
5.7%
542996
 
4.2%
741448
 
4.0%
840161
 
3.9%
639035
 
3.8%
938501
 
3.7%
Uppercase Letter
ValueCountFrequency (%)
X79293
100.0%
Space Separator
ValueCountFrequency (%)
79293
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common1110102
93.3%
Latin79293
 
6.7%

Most frequent character per script

Common
ValueCountFrequency (%)
0317120
28.6%
2202726
18.3%
1183765
16.6%
79293
 
7.1%
365991
 
5.9%
459066
 
5.3%
542996
 
3.9%
741448
 
3.7%
840161
 
3.6%
639035
 
3.5%
Latin
ValueCountFrequency (%)
X79293
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1189395
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0317120
26.7%
2202726
17.0%
1183765
15.5%
X79293
 
6.7%
79293
 
6.7%
365991
 
5.5%
459066
 
5.0%
542996
 
3.6%
741448
 
3.5%
840161
 
3.4%
Other values (2)77536
 
6.5%

Investigation.Type
Categorical

HIGH CORRELATION
HIGH CORRELATION

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 619.6 KiB

Value | Count
Accident | 76118
Incident | 3175

Length

Max length8
Median length8
Mean length8
Min length8

Characters and Unicode

Total characters634344
Distinct characters8
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowAccident
2nd rowAccident
3rd rowAccident
4th rowAccident
5th rowAccident

Common Values

Value | Count | Frequency (%)
Accident | 76118 | 96.0%
Incident | 3175 | 4.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
accident76118
96.0%
incident3175
 
4.0%

Most occurring characters

ValueCountFrequency (%)
c155411
24.5%
n82468
13.0%
i79293
12.5%
d79293
12.5%
e79293
12.5%
t79293
12.5%
A76118
12.0%
I3175
 
0.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter555051
87.5%
Uppercase Letter79293
 
12.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c155411
28.0%
n82468
14.9%
i79293
14.3%
d79293
14.3%
e79293
14.3%
t79293
14.3%
Uppercase Letter
ValueCountFrequency (%)
A76118
96.0%
I3175
 
4.0%

Most occurring scripts

ValueCountFrequency (%)
Latin634344
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c155411
24.5%
n82468
13.0%
i79293
12.5%
d79293
12.5%
e79293
12.5%
t79293
12.5%
A76118
12.0%
I3175
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII634344
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c155411
24.5%
n82468
13.0%
i79293
12.5%
d79293
12.5%
e79293
12.5%
t79293
12.5%
A76118
12.0%
I3175
 
0.5%

Accident.Number
Categorical

HIGH CARDINALITY
UNIFORM
UNIQUE

Distinct: 79293
Distinct (%): 100.0%
Missing: 0
Missing (%): 0.0%
Memory size: 619.6 KiB

Value | Count
MIA88LA098 | 1
NYC94LA060 | 1
DFW08FA218 | 1
MIA90FA017 | 1
NYC05LA081 | 1
Other values (79288) | 79288

Length

Max length11
Median length10
Mean length10.02896851
Min length9

Characters and Unicode

Total characters795227
Distinct characters37
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique79293 ?
Unique (%)100.0%

Sample

1st rowWPR17LA046
2nd rowWPR17FA044
3rd rowCEN17LA062
4th rowCEN17LA061
5th rowWPR17FA041

Common Values

Value | Count | Frequency (%)
MIA88LA098 | 1 | < 0.1%
NYC94LA060 | 1 | < 0.1%
DFW08FA218 | 1 | < 0.1%
MIA90FA017 | 1 | < 0.1%
NYC05LA081 | 1 | < 0.1%
DEN83LA068 | 1 | < 0.1%
MIA03CA091 | 1 | < 0.1%
ANC88LA146 | 1 | < 0.1%
CHI99FA053 | 1 | < 0.1%
FTW03LA092 | 1 | < 0.1%
Other values (79283) | 79283 | > 99.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
wpr10la0011
 
< 0.1%
chi93la2221
 
< 0.1%
iad01wa0751
 
< 0.1%
chi07ca2641
 
< 0.1%
lax83la1291
 
< 0.1%
chi99fa2931
 
< 0.1%
gaa15ca0101
 
< 0.1%
ftw00ra0621
 
< 0.1%
chi96fa1851
 
< 0.1%
sea91la1671
 
< 0.1%
Other values (79283)79283
> 99.9%

Most occurring characters

ValueCountFrequency (%)
A118328
14.9%
076986
 
9.7%
L61196
 
7.7%
160517
 
7.6%
845229
 
5.7%
943401
 
5.5%
238264
 
4.8%
C38136
 
4.8%
330922
 
3.9%
427059
 
3.4%
Other values (27)255189
32.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter402157
50.6%
Decimal Number392910
49.4%
Other Punctuation160
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A118328
29.4%
L61196
15.2%
C38136
 
9.5%
F23309
 
5.8%
N20941
 
5.2%
E19223
 
4.8%
I18638
 
4.6%
W14994
 
3.7%
T13753
 
3.4%
D11450
 
2.8%
Other values (16)62189
15.5%
Decimal Number
ValueCountFrequency (%)
076986
19.6%
160517
15.4%
845229
11.5%
943401
11.0%
238264
9.7%
330922
7.9%
427059
 
6.9%
525143
 
6.4%
623670
 
6.0%
721719
 
5.5%
Other Punctuation
ValueCountFrequency (%)
#160
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin402157
50.6%
Common393070
49.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
A118328
29.4%
L61196
15.2%
C38136
 
9.5%
F23309
 
5.8%
N20941
 
5.2%
E19223
 
4.8%
I18638
 
4.6%
W14994
 
3.7%
T13753
 
3.4%
D11450
 
2.8%
Other values (16)62189
15.5%
Common
ValueCountFrequency (%)
076986
19.6%
160517
15.4%
845229
11.5%
943401
11.0%
238264
9.7%
330922
7.9%
427059
 
6.9%
525143
 
6.4%
623670
 
6.0%
721719
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII795227
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A118328
14.9%
076986
 
9.7%
L61196
 
7.7%
160517
 
7.6%
845229
 
5.7%
943401
 
5.5%
238264
 
4.8%
C38136
 
4.8%
330922
 
3.9%
427059
 
3.4%
Other values (27)255189
32.1%

Event.Date
Categorical

HIGH CARDINALITY

Distinct: 12638
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Memory size: 619.6 KiB

Value | Count
2000-07-08 | 25
1982-05-16 | 25
1984-06-30 | 25
1983-08-05 | 24
1983-06-05 | 24
Other values (12633) | 79170

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters792930
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique542 ?
Unique (%)0.7%

Sample

1st row2017-01-03
2nd row2016-12-29
3rd row2016-12-27
4th row2016-12-27
5th row2016-12-26
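
Event.Date is profiled as a categorical variable, although the sample values look like ISO 8601 dates (YYYY-MM-DD). A minimal sketch of parsing them into date objects and counting events per year follows. It assumes every value uses this format, which the constant length of 10 suggests but does not prove.

```python
from collections import Counter
from datetime import date

# The five sample rows shown for Event.Date.
sample_dates = [
    "2017-01-03",
    "2016-12-29",
    "2016-12-27",
    "2016-12-27",
    "2016-12-26",
]

# date.fromisoformat raises ValueError on anything that is not YYYY-MM-DD,
# which makes malformed entries easy to spot.
parsed = [date.fromisoformat(s) for s in sample_dates]

# Count events per calendar year.
events_per_year = Counter(d.year for d in parsed)
print(events_per_year)
```

Treating the column as dates rather than strings also removes the High cardinality warning as a concern, since dates are expected to take many values.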

Common Values

Value | Count | Frequency (%)
2000-07-08 | 25 | < 0.1%
1982-05-16 | 25 | < 0.1%
1984-06-30 | 25 | < 0.1%
1983-08-05 | 24 | < 0.1%
1983-06-05 | 24 | < 0.1%
1986-05-17 | 24 | < 0.1%
1984-08-25 | 24 | < 0.1%
2001-07-21 | 23 | < 0.1%
2001-06-16 | 23 | < 0.1%
1982-10-03 | 23 | < 0.1%
Other values (12628) | 79053 | 99.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
2000-07-0825
 
< 0.1%
1982-05-1625
 
< 0.1%
1984-06-3025
 
< 0.1%
1983-08-0524
 
< 0.1%
1983-06-0524
 
< 0.1%
1986-05-1724
 
< 0.1%
1984-08-2524
 
< 0.1%
2001-07-2123
 
< 0.1%
2001-06-1623
 
< 0.1%
1982-10-0323
 
< 0.1%
Other values (12628)79053
99.7%

Most occurring characters

ValueCountFrequency (%)
0159256
20.1%
-158586
20.0%
1126056
15.9%
992245
11.6%
284143
10.6%
848463
 
6.1%
327198
 
3.4%
624806
 
3.1%
724363
 
3.1%
524355
 
3.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number634344
80.0%
Dash Punctuation158586
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0159256
25.1%
1126056
19.9%
992245
14.5%
284143
13.3%
848463
 
7.6%
327198
 
4.3%
624806
 
3.9%
724363
 
3.8%
524355
 
3.8%
423459
 
3.7%
Dash Punctuation
ValueCountFrequency (%)
-158586
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common792930
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0159256
20.1%
-158586
20.0%
1126056
15.9%
992245
11.6%
284143
10.6%
848463
 
6.1%
327198
 
3.4%
624806
 
3.1%
724363
 
3.1%
524355
 
3.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII792930
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0159256
20.1%
-158586
20.0%
1126056
15.9%
992245
11.6%
284143
10.6%
848463
 
6.1%
327198
 
3.4%
624806
 
3.1%
724363
 
3.1%
524355
 
3.1%

Location
Categorical

HIGH CARDINALITY

Distinct: 25264
Distinct (%): 31.9%
Missing: 78
Missing (%): 0.1%
Memory size: 619.6 KiB

Value | Count
ANCHORAGE, AK | 372
MIAMI, FL | 185
CHICAGO, IL | 169
ALBUQUERQUE, NM | 165
HOUSTON, TX | 155
Other values (25259) | 78169

Length

Max length61
Median length12
Mean length12.94384902
Min length4

Characters and Unicode

Total characters1025347
Distinct characters80
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique14897 ?
Unique (%)18.8%

Sample

1st rowParadise, MT
2nd rowDabob, WA
3rd rowPiedmont, MO
4th rowFarmington, MO
5th rowFresno, CA

Common Values

Value | Count | Frequency (%)
ANCHORAGE, AK | 372 | 0.5%
MIAMI, FL | 185 | 0.2%
CHICAGO, IL | 169 | 0.2%
ALBUQUERQUE, NM | 165 | 0.2%
HOUSTON, TX | 155 | 0.2%
Anchorage, AK | 140 | 0.2%
FAIRBANKS, AK | 138 | 0.2%
ORLANDO, FL | 114 | 0.1%
ENGLEWOOD, CO | 107 | 0.1%
TUCSON, AZ | 107 | 0.1%
Other values (25254) | 77563 | 97.8%
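
The common values list "ANCHORAGE, AK" and "Anchorage, AK" as separate categories, so part of Location's cardinality comes from inconsistent casing. The sketch below merges such variants before counting. The helper name is hypothetical, and uppercasing is only one possible normalisation; it would not reconcile differences in spelling or punctuation.

```python
from collections import Counter


def normalize_location(raw):
    """Collapse runs of whitespace and uppercase the value."""
    return " ".join(raw.split()).upper()


# Counts copied from the common values for Location.
observed = Counter({
    "ANCHORAGE, AK": 372,
    "Anchorage, AK": 140,
    "MIAMI, FL": 185,
})

# Re-aggregate the counts under the normalized key.
merged = Counter()
for location, count in observed.items():
    merged[normalize_location(location)] += count

print(merged.most_common())
```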

Length

Histogram of lengths of the category
ValueCountFrequency (%)
ca8179
 
4.5%
tx5265
 
2.9%
fl5228
 
2.9%
ak5171
 
2.9%
az2554
 
1.4%
co2508
 
1.4%
wa2394
 
1.3%
il1914
 
1.1%
mi1896
 
1.0%
city1883
 
1.0%
Other values (12575)143853
79.5%

Most occurring characters

ValueCountFrequency (%)
101630
 
9.9%
,79108
 
7.7%
A74922
 
7.3%
N47000
 
4.6%
L45800
 
4.5%
E44628
 
4.4%
O41931
 
4.1%
I36346
 
3.5%
T34118
 
3.3%
R33380
 
3.3%
Other values (70)486484
47.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter609570
59.5%
Lowercase Letter232835
 
22.7%
Space Separator101630
 
9.9%
Other Punctuation80593
 
7.9%
Decimal Number464
 
< 0.1%
Dash Punctuation216
 
< 0.1%
Open Punctuation12
 
< 0.1%
Close Punctuation12
 
< 0.1%
Format9
 
< 0.1%
Control5
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a29434
12.6%
e25946
11.1%
n21291
9.1%
o19568
 
8.4%
l17958
 
7.7%
i17473
 
7.5%
r17236
 
7.4%
t13286
 
5.7%
s11330
 
4.9%
d7736
 
3.3%
Other values (17)51577
22.2%
Uppercase Letter
ValueCountFrequency (%)
A74922
 
12.3%
N47000
 
7.7%
L45800
 
7.5%
E44628
 
7.3%
O41931
 
6.9%
I36346
 
6.0%
T34118
 
5.6%
R33380
 
5.5%
C33249
 
5.5%
S30350
 
5.0%
Other values (16)187846
30.8%
Decimal Number
ValueCountFrequency (%)
183
17.9%
061
13.1%
260
12.9%
553
11.4%
345
9.7%
437
8.0%
734
7.3%
633
 
7.1%
831
 
6.7%
927
 
5.8%
Other Punctuation
ValueCountFrequency (%)
,79108
98.2%
.1211
 
1.5%
'166
 
0.2%
?73
 
0.1%
/29
 
< 0.1%
#3
 
< 0.1%
§2
 
< 0.1%
&1
 
< 0.1%
Control
ValueCountFrequency (%)
œ2
40.0%
2
40.0%
1
20.0%
Space Separator
ValueCountFrequency (%)
101630
100.0%
Dash Punctuation
ValueCountFrequency (%)
-216
100.0%
Open Punctuation
ValueCountFrequency (%)
(12
100.0%
Close Punctuation
ValueCountFrequency (%)
)12
100.0%
Format
ValueCountFrequency (%)
­9
100.0%
Modifier Symbol
ValueCountFrequency (%)
`1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin842405
82.2%
Common182942
 
17.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
A74922
 
8.9%
N47000
 
5.6%
L45800
 
5.4%
E44628
 
5.3%
O41931
 
5.0%
I36346
 
4.3%
T34118
 
4.1%
R33380
 
4.0%
C33249
 
3.9%
S30350
 
3.6%
Other values (43)420681
49.9%
Common
ValueCountFrequency (%)
101630
55.6%
,79108
43.2%
.1211
 
0.7%
-216
 
0.1%
'166
 
0.1%
183
 
< 0.1%
?73
 
< 0.1%
061
 
< 0.1%
260
 
< 0.1%
553
 
< 0.1%
Other values (17)281
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII1025332
> 99.9%
Latin 1 Sup15
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
101630
 
9.9%
,79108
 
7.7%
A74922
 
7.3%
N47000
 
4.6%
L45800
 
4.5%
E44628
 
4.4%
O41931
 
4.1%
I36346
 
3.5%
T34118
 
3.3%
R33380
 
3.3%
Other values (65)486469
47.4%
Latin 1 Sup
ValueCountFrequency (%)
­9
60.0%
§2
 
13.3%
œ2
 
13.3%
1
 
6.7%
ñ1
 
6.7%

Country
Categorical

HIGH CARDINALITY

Distinct: 177
Distinct (%): 0.2%
Missing: 507
Missing (%): 0.6%
Memory size: 619.6 KiB

Value | Count
United States | 74734
Canada | 256
Brazil | 220
United Kingdom | 216
Mexico | 210
Other values (172) | 3150

Length

Max length30
Median length13
Mean length12.74399005
Min length4

Characters and Unicode

Total characters1004048
Distinct characters54
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique33 ?
Unique (%)< 0.1%

Sample

1st rowUnited States
2nd rowUnited States
3rd rowUnited States
4th rowUnited States
5th rowUnited States

Common Values

Value | Count | Frequency (%)
United States | 74734 | 94.3%
Canada | 256 | 0.3%
Brazil | 220 | 0.3%
United Kingdom | 216 | 0.3%
Mexico | 210 | 0.3%
Australia | 196 | 0.2%
Bahamas | 191 | 0.2%
France | 164 | 0.2%
Germany | 158 | 0.2%
Colombia | 134 | 0.2%
Other values (167) | 2307 | 2.9%
(Missing) | 507 | 0.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
united74960
48.6%
states74738
48.5%
canada256
 
0.2%
brazil220
 
0.1%
kingdom216
 
0.1%
mexico210
 
0.1%
australia196
 
0.1%
bahamas191
 
0.1%
france164
 
0.1%
germany158
 
0.1%
Other values (198)2920
 
1.9%

Most occurring characters

ValueCountFrequency (%)
t225289
22.4%
e151786
15.1%
a79938
 
8.0%
i77483
 
7.7%
n77477
 
7.7%
d76140
 
7.6%
s75767
 
7.5%
75443
 
7.5%
S75062
 
7.5%
U74994
 
7.5%
Other values (44)14669
 
1.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter774341
77.1%
Uppercase Letter154228
 
15.4%
Space Separator75443
 
7.5%
Other Punctuation36
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t225289
29.1%
e151786
19.6%
a79938
 
10.3%
i77483
 
10.0%
n77477
 
10.0%
d76140
 
9.8%
s75767
 
9.8%
r1666
 
0.2%
l1505
 
0.2%
o1340
 
0.2%
Other values (16)5950
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
S75062
48.7%
U74994
48.6%
C572
 
0.4%
B510
 
0.3%
A470
 
0.3%
I327
 
0.2%
K282
 
0.2%
G276
 
0.2%
M270
 
0.2%
P219
 
0.1%
Other values (15)1246
 
0.8%
Other Punctuation
ValueCountFrequency (%)
,34
94.4%
'2
 
5.6%
Space Separator
ValueCountFrequency (%)
75443
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin928569
92.5%
Common75479
 
7.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
t225289
24.3%
e151786
16.3%
a79938
 
8.6%
i77483
 
8.3%
n77477
 
8.3%
d76140
 
8.2%
s75767
 
8.2%
S75062
 
8.1%
U74994
 
8.1%
r1666
 
0.2%
Other values (41)12967
 
1.4%
Common
ValueCountFrequency (%)
75443
> 99.9%
,34
 
< 0.1%
'2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII1004048
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t225289
22.4%
e151786
15.1%
a79938
 
8.0%
i77483
 
7.7%
n77477
 
7.7%
d76140
 
7.6%
s75767
 
7.5%
75443
 
7.5%
S75062
 
7.5%
U74994
 
7.5%
Other values (44)14669
 
1.5%

Latitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 17665
Distinct (%): 68.6%
Missing: 53542
Missing (%): 67.5%
Infinite: 0
Infinite (%): 0.0%
Mean: 37.69042077
Minimum: -78.016945
Maximum: 89.218056
Zeros: 10
Zeros (%): < 0.1%
Negative: 465
Negative (%): 0.6%
Memory size: 619.6 KiB

Quantile statistics

Minimum: -78.016945
5-th percentile: 25.916945
Q1: 33.379445
median: 38.184166
Q3: 42.566528
95-th percentile: 60.417917
Maximum: 89.218056
Range: 167.235001
Interquartile range (IQR): 9.187083

Descriptive statistics

Standard deviation: 12.14801904
Coefficient of variation (CV): 0.3223105179
Kurtosis: 11.61835217
Mean: 37.69042077
Median Absolute Deviation (MAD): 4.619444
Skewness: -1.96396679
Sum: 970566.0253
Variance: 147.5743666
Monotonicity: Not monotonic
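
Several of the Latitude statistics are derived from others: the range is maximum minus minimum, the IQR is Q3 minus Q1, the coefficient of variation is the standard deviation divided by the mean, and the variance is the squared standard deviation. The sketch below recomputes them from the reported figures as a consistency check; the variable names are illustrative.

```python
# Figures copied from the Latitude quantile and descriptive statistics.
minimum, maximum = -78.016945, 89.218056
q1, q3 = 33.379445, 42.566528
mean, std = 37.69042077, 12.14801904

value_range = maximum - minimum  # reported Range: 167.235001
iqr = q3 - q1                    # reported IQR: 9.187083
cv = std / mean                  # reported CV: 0.3223105179
variance = std ** 2              # reported Variance: 147.5743666

print(f"range={value_range:.6f} iqr={iqr:.6f} cv={cv:.6f} variance={variance:.4f}")
```

The same relations explain why Longitude, whose mean is negative, has a negative coefficient of variation.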
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
33.46083328
 
< 0.1%
32.82611123
 
< 0.1%
33.68833322
 
< 0.1%
61.21361121
 
< 0.1%
32.81555620
 
< 0.1%
26.19722219
 
< 0.1%
34.65444419
 
< 0.1%
33.26916718
 
< 0.1%
43.98444417
 
< 0.1%
33.87555617
 
< 0.1%
Other values (17655)25547
32.2%
(Missing)53542
67.5%
ValueCountFrequency (%)
-78.0169451
< 0.1%
-77.8333331
< 0.1%
-771
< 0.1%
-61.8833341
< 0.1%
-48.5713891
< 0.1%
-46.0666671
< 0.1%
-45.5666671
< 0.1%
-45.2833341
< 0.1%
-45.2022221
< 0.1%
-45.1563891
< 0.1%
ValueCountFrequency (%)
89.2180561
< 0.1%
87.1458331
< 0.1%
86.9472221
< 0.1%
84.6372221
< 0.1%
84.5716671
< 0.1%
81.8383331
< 0.1%
80.5822231
< 0.1%
77.3511112
< 0.1%
77.0072221
< 0.1%
76.3041671
< 0.1%

Longitude
Real number (ℝ)

HIGH CORRELATION
MISSING

Distinct: 18925
Distinct (%): 73.5%
Missing: 53551
Missing (%): 67.5%
Infinite: 0
Infinite (%): 0.0%
Mean: -93.78106079
Minimum: -178.676111
Maximum: 177.557778
Zeros: 9
Zeros (%): < 0.1%
Negative: 24892
Negative (%): 31.4%
Memory size: 619.6 KiB

Quantile statistics

Minimum: -178.676111
5-th percentile: -149.286389
Q1: -115.0085417
median: -94.498055
Q3: -81.72583375
95-th percentile: -66.01951445
Maximum: 177.557778
Range: 356.233889
Interquartile range (IQR): 33.282708

Descriptive statistics

Standard deviation: 39.24366214
Coefficient of variation (CV): -0.4184604206
Kurtosis: 15.0216894
Mean: -93.78106079
Median Absolute Deviation (MAD): 15.385139
Skewness: 3.074226433
Sum: -2414112.067
Variance: 1540.065018
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-112.082532
 
< 0.1%
-111.72833429
 
< 0.1%
-111.81111124
 
< 0.1%
-104.67305622
 
< 0.1%
-149.84444421
 
< 0.1%
-88.55694420
 
< 0.1%
-117.13944418
 
< 0.1%
-116.972518
 
< 0.1%
-106.60916717
 
< 0.1%
-121.81972316
 
< 0.1%
Other values (18915)25525
32.2%
(Missing)53551
67.5%
ValueCountFrequency (%)
-178.6761111
 
< 0.1%
-175.3730561
 
< 0.1%
-174.2966661
 
< 0.1%
-173.241
 
< 0.1%
-170.7113893
< 0.1%
-170.4888891
 
< 0.1%
-169.5352771
 
< 0.1%
-168.3627781
 
< 0.1%
-168.241
 
< 0.1%
-167.8722231
 
< 0.1%
ValueCountFrequency (%)
177.5577781
< 0.1%
177.3741671
< 0.1%
176.2166671
< 0.1%
176.0713891
< 0.1%
175.5833331
< 0.1%
175.251
< 0.1%
174.7666671
< 0.1%
174.3333331
< 0.1%
174.3002781
< 0.1%
174.21
< 0.1%

Airport.Code
Categorical

HIGH CARDINALITY
MISSING

Distinct: 9630
Distinct (%): 21.6%
Missing: 34630
Missing (%): 43.7%
Memory size: 619.6 KiB

Value | Count
NONE | 1466
PVT | 356
ORD | 146
APA | 142
MRI | 127
Other values (9625) | 42426

Length

Max length8
Median length3
Mean length3.159595191
Min length1

Characters and Unicode

Total characters141117
Distinct characters70
Distinct categories10 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique4398 ?
Unique (%)9.8%

Sample

1st rowFAM
2nd rowE79
3rd rowGKT
4th row6B0
5th row10G

Common Values

Value | Count | Frequency (%)
NONE | 1466 | 1.8%
PVT | 356 | 0.4%
ORD | 146 | 0.2%
APA | 142 | 0.2%
MRI | 127 | 0.2%
DEN | 113 | 0.1%
OSH | 93 | 0.1%
FFZ | 89 | 0.1%
VNY | 89 | 0.1%
BJC | 87 | 0.1%
Other values (9620) | 41955 | 52.9%
(Missing) | 34630 | 43.7%
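
The two most frequent Airport.Code values, NONE and PVT, look like placeholders rather than real airport codes. The sketch below treats them as missing, under the assumption (not confirmed by the report) that they mean "no airport" and "private strip". The constant and the function name are hypothetical.

```python
# Assumption: these strings are placeholders, not real airport codes.
PLACEHOLDER_CODES = {"NONE", "PVT"}


def clean_airport_code(raw):
    """Return an uppercased code, or None for missing or placeholder values."""
    if raw is None:
        return None
    code = raw.strip().upper()
    if not code or code in PLACEHOLDER_CODES:
        return None
    return code


# Effect on the missing share, using counts from the common values above.
# Only the exact spellings NONE and PVT listed there are counted.
n_observations = 79293
effective_missing = 34630 + 1466 + 356
effective_missing_pct = 100 * effective_missing / n_observations
print(f"{effective_missing_pct:.1f}% missing after cleaning")
```

Under this assumption the missing share of Airport.Code rises from the reported 43.7% to roughly 46%.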

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none1553
 
3.5%
pvt363
 
0.8%
ord146
 
0.3%
apa142
 
0.3%
mri127
 
0.3%
den113
 
0.3%
osh93
 
0.2%
ffz89
 
0.2%
vny89
 
0.2%
bjc87
 
0.2%
Other values (9596)41866
93.7%

Most occurring characters

ValueCountFrequency (%)
N7815
 
5.5%
A7115
 
5.0%
S6366
 
4.5%
O6071
 
4.3%
L5795
 
4.1%
C5418
 
3.8%
E5216
 
3.7%
M5069
 
3.6%
T4861
 
3.4%
K4711
 
3.3%
Other values (60)82680
58.6%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter107748
76.4%
Decimal Number32836
 
23.3%
Lowercase Letter354
 
0.3%
Dash Punctuation87
 
0.1%
Other Punctuation46
 
< 0.1%
Math Symbol38
 
< 0.1%
Space Separator5
 
< 0.1%
Currency Symbol1
 
< 0.1%
Close Punctuation1
 
< 0.1%
Control1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N7815
 
7.3%
A7115
 
6.6%
S6366
 
5.9%
O6071
 
5.6%
L5795
 
5.4%
C5418
 
5.0%
E5216
 
4.8%
M5069
 
4.7%
T4861
 
4.5%
K4711
 
4.4%
Other values (16)49311
45.8%
Lowercase Letter
ValueCountFrequency (%)
n98
27.7%
e91
25.7%
o88
24.9%
v9
 
2.5%
t9
 
2.5%
p9
 
2.5%
i7
 
2.0%
l6
 
1.7%
x5
 
1.4%
a4
 
1.1%
Other values (12)28
 
7.9%
Decimal Number
ValueCountFrequency (%)
14086
12.4%
23794
11.6%
03586
10.9%
33534
10.8%
43331
10.1%
53160
9.6%
73109
9.5%
62961
9.0%
82724
8.3%
92551
7.8%
Other Punctuation
ValueCountFrequency (%)
.39
84.8%
/3
 
6.5%
\1
 
2.2%
*1
 
2.2%
;1
 
2.2%
&1
 
2.2%
Dash Punctuation
ValueCountFrequency (%)
-87
100.0%
Math Symbol
ValueCountFrequency (%)
+38
100.0%
Currency Symbol
ValueCountFrequency (%)
$1
100.0%
Close Punctuation
ValueCountFrequency (%)
)1
100.0%
Control
ValueCountFrequency (%)
1
100.0%
Space Separator
ValueCountFrequency (%)
5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin108102
76.6%
Common33015
 
23.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N7815
 
7.2%
A7115
 
6.6%
S6366
 
5.9%
O6071
 
5.6%
L5795
 
5.4%
C5418
 
5.0%
E5216
 
4.8%
M5069
 
4.7%
T4861
 
4.5%
K4711
 
4.4%
Other values (38)49665
45.9%
Common
ValueCountFrequency (%)
14086
12.4%
23794
11.5%
03586
10.9%
33534
10.7%
43331
10.1%
53160
9.6%
73109
9.4%
62961
9.0%
82724
8.3%
92551
7.7%
Other values (12)179
 
0.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII141117
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N7815
 
5.5%
A7115
 
5.0%
S6366
 
4.5%
O6071
 
4.3%
L5795
 
4.1%
C5418
 
3.8%
E5216
 
3.7%
M5069
 
3.6%
T4861
 
3.4%
K4711
 
3.3%
Other values (60)82680
58.6%

Airport.Name
Categorical

HIGH CARDINALITY
MISSING

Distinct: 22760
Distinct (%): 48.0%
Missing: 31857
Missing (%): 40.2%
Memory size: 619.6 KiB

Value | Count
PRIVATE | 216
Private | 173
NONE | 140
Private Airstrip | 135
PRIVATE STRIP | 111
Other values (22755) | 46661

Length

Max length33
Median length15
Mean length15.5105616
Min length1

Characters and Unicode

Total characters735759
Distinct characters84
Distinct categories11 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique15230 ?
Unique (%)32.1%

Sample

1st rowFARMINGTON RGNL
2nd rowSierra Sky Park
3rd rowGATLINBURG-PIGEON FORGE
4th rowMIDDLEBURY STATE
5th rowHolmes County

Common Values

Value | Count | Frequency (%)
PRIVATE | 216 | 0.3%
Private | 173 | 0.2%
NONE | 140 | 0.2%
Private Airstrip | 135 | 0.2%
PRIVATE STRIP | 111 | 0.1%
PRIVATE AIRSTRIP | 90 | 0.1%
None | 84 | 0.1%
MERRILL FIELD | 78 | 0.1%
MUNICIPAL | 78 | 0.1%
VAN NUYS | 74 | 0.1%
Other values (22750) | 46257 | 58.3%
(Missing) | 31857 | 40.2%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
airport7795
 
7.6%
municipal4241
 
4.1%
county3408
 
3.3%
field3078
 
3.0%
muni1741
 
1.7%
regional1666
 
1.6%
international1653
 
1.6%
private1067
 
1.0%
lake923
 
0.9%
intl897
 
0.9%
Other values (10048)76380
74.3%

Most occurring characters

ValueCountFrequency (%)
55413
 
7.5%
A49317
 
6.7%
E36658
 
5.0%
I36135
 
4.9%
N35653
 
4.8%
R34018
 
4.6%
L32166
 
4.4%
O28928
 
3.9%
r27025
 
3.7%
T26612
 
3.6%
Other values (74)373834
50.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter446490
60.7%
Lowercase Letter226095
30.7%
Space Separator55413
 
7.5%
Other Punctuation4190
 
0.6%
Dash Punctuation2975
 
0.4%
Decimal Number268
 
< 0.1%
Open Punctuation162
 
< 0.1%
Close Punctuation154
 
< 0.1%
Control8
 
< 0.1%
Math Symbol3
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
r27025
12.0%
i24786
11.0%
a20982
9.3%
o20751
9.2%
e19954
8.8%
n19386
8.6%
t19029
8.4%
l15244
 
6.7%
p11071
 
4.9%
u7348
 
3.2%
Other values (17)40519
17.9%
Uppercase Letter
ValueCountFrequency (%)
A49317
 
11.0%
E36658
 
8.2%
I36135
 
8.1%
N35653
 
8.0%
R34018
 
7.6%
L32166
 
7.2%
O28928
 
6.5%
T26612
 
6.0%
S20477
 
4.6%
C19928
 
4.5%
Other values (16)126598
28.4%
Other Punctuation
ValueCountFrequency (%)
.2098
50.1%
'998
23.8%
/792
 
18.9%
,175
 
4.2%
?61
 
1.5%
&33
 
0.8%
"16
 
0.4%
#14
 
0.3%
;2
 
< 0.1%
\1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
267
25.0%
137
13.8%
336
13.4%
625
 
9.3%
425
 
9.3%
019
 
7.1%
718
 
6.7%
516
 
6.0%
814
 
5.2%
911
 
4.1%
Control
ValueCountFrequency (%)
ƒ5
62.5%
œ2
 
25.0%
1
 
12.5%
Open Punctuation
ValueCountFrequency (%)
(159
98.1%
[3
 
1.9%
Close Punctuation
ValueCountFrequency (%)
)152
98.7%
]2
 
1.3%
Space Separator
ValueCountFrequency (%)
55413
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2975
100.0%
Math Symbol
ValueCountFrequency (%)
¬3
100.0%
Format
ValueCountFrequency (%)
­1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin672585
91.4%
Common63174
 
8.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
A49317
 
7.3%
E36658
 
5.5%
I36135
 
5.4%
N35653
 
5.3%
R34018
 
5.1%
L32166
 
4.8%
O28928
 
4.3%
r27025
 
4.0%
T26612
 
4.0%
i24786
 
3.7%
Other values (43)341287
50.7%
Common
ValueCountFrequency (%)
55413
87.7%
-2975
 
4.7%
.2098
 
3.3%
'998
 
1.6%
/792
 
1.3%
,175
 
0.3%
(159
 
0.3%
)152
 
0.2%
267
 
0.1%
?61
 
0.1%
Other values (21)284
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII735747
> 99.9%
Latin 1 Sup12
 
< 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
55413
 
7.5%
A49317
 
6.7%
E36658
 
5.0%
I36135
 
4.9%
N35653
 
4.8%
R34018
 
4.6%
L32166
 
4.4%
O28928
 
3.9%
r27025
 
3.7%
T26612
 
3.6%
Other values (69)373822
50.8%
Latin 1 Sup
ValueCountFrequency (%)
ƒ5
41.7%
¬3
25.0%
œ2
 
16.7%
­1
 
8.3%
ñ1
 
8.3%

Injury.Severity
Categorical

HIGH CARDINALITY

Distinct124
Distinct (%)0.2%
Missing0
Missing (%)0.0%
Memory size619.6 KiB
Non-Fatal
60025 
Fatal(1)
7830 
Fatal(2)
 
4618
Incident
 
3175
Fatal(3)
 
1450
Other values (119)
 
2195

Length

Max length11
Median length9
Mean length8.769513072
Min length8

Characters and Unicode

Total characters695361
Distinct characters28
Distinct categories6
Distinct scripts2
Distinct blocks1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
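
As an illustration of the character properties mentioned above, the following is a minimal sketch that tallies Unicode general categories with Python's standard `unicodedata` module. It is applied to the five sample rows listed for Injury.Severity in this report; the helper name `category_counts` is hypothetical, and this is not necessarily how the report itself derived its counts.

```python
import unicodedata
from collections import Counter

# The five sample rows listed for Injury.Severity in this report.
sample_rows = ["Non-Fatal", "Fatal(4)", "Non-Fatal", "Non-Fatal", "Fatal(2)"]


def category_counts(values):
    """Count characters by Unicode general category code (e.g. 'Lu', 'Ll', 'Nd')."""
    counts = Counter()
    for value in values:
        for ch in value:
            counts[unicodedata.category(ch)] += 1
    return counts


counts = category_counts(sample_rows)
for code, n in counts.most_common():
    print(code, n)
```

The two-letter codes map onto the category names used in the tables: `Lu` is Uppercase Letter, `Ll` is Lowercase Letter, `Pd` is Dash Punctuation, `Ps` is Open Punctuation, `Pe` is Close Punctuation, and `Nd` is Decimal Number.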

Unique

Unique67
Unique (%)0.1%

Sample

1st rowNon-Fatal
2nd rowFatal(4)
3rd rowNon-Fatal
4th rowNon-Fatal
5th rowFatal(2)

Common Values

ValueCountFrequency (%)
Non-Fatal60025
75.7%
Fatal(1)7830
 
9.9%
Fatal(2)4618
 
5.8%
Incident3175
 
4.0%
Fatal(3)1450
 
1.8%
Fatal(4)1012
 
1.3%
Fatal(5)311
 
0.4%
Unavailable220
 
0.3%
Fatal(6)196
 
0.2%
Fatal(7)83
 
0.1%
Other values (114)373
 
0.5%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
non-fatal60025
75.7%
fatal(17830
 
9.9%
fatal(24618
 
5.8%
incident3175
 
4.0%
fatal(31450
 
1.8%
fatal(41012
 
1.3%
fatal(5311
 
0.4%
unavailable220
 
0.3%
fatal(6196
 
0.2%
fatal(783
 
0.1%
Other values (114)373
 
0.5%

Most occurring characters

ValueCountFrequency (%)
a152456
21.9%
t79073
11.4%
l76338
11.0%
F75898
10.9%
n66595
9.6%
N60025
 
8.6%
o60025
 
8.6%
-60025
 
8.6%
(15873
 
2.3%
)15873
 
2.3%
Other values (18)33180
 
4.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter448067
64.4%
Uppercase Letter139318
 
20.0%
Dash Punctuation60025
 
8.6%
Decimal Number16205
 
2.3%
Open Punctuation15873
 
2.3%
Close Punctuation15873
 
2.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a152456
34.0%
t79073
17.6%
l76338
17.0%
n66595
14.9%
o60025
 
13.4%
i3395
 
0.8%
e3395
 
0.8%
c3175
 
0.7%
d3175
 
0.7%
v220
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
18034
49.6%
24692
29.0%
31498
 
9.2%
41064
 
6.6%
5354
 
2.2%
6222
 
1.4%
7115
 
0.7%
894
 
0.6%
071
 
0.4%
961
 
0.4%
Uppercase Letter
ValueCountFrequency (%)
F75898
54.5%
N60025
43.1%
I3175
 
2.3%
U220
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
-60025
100.0%
Open Punctuation
ValueCountFrequency (%)
(15873
100.0%
Close Punctuation
ValueCountFrequency (%)
)15873
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin587385
84.5%
Common107976
 
15.5%

Most frequent character per script

Latin
ValueCountFrequency (%)
a152456
26.0%
t79073
13.5%
l76338
13.0%
F75898
12.9%
n66595
11.3%
N60025
 
10.2%
o60025
 
10.2%
i3395
 
0.6%
e3395
 
0.6%
I3175
 
0.5%
Other values (5)7010
 
1.2%
Common
ValueCountFrequency (%)
-60025
55.6%
(15873
 
14.7%
)15873
 
14.7%
18034
 
7.4%
24692
 
4.3%
31498
 
1.4%
41064
 
1.0%
5354
 
0.3%
6222
 
0.2%
7115
 
0.1%
Other values (3)226
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII695361
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
a152456
21.9%
t79073
11.4%
l76338
11.0%
F75898
10.9%
n66595
9.6%
N60025
 
8.6%
o60025
 
8.6%
-60025
 
8.6%
(15873
 
2.3%
)15873
 
2.3%
Other values (18)33180
 
4.8%

Aircraft.Damage
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct3
Distinct (%)< 0.1%
Missing2410
Missing (%)3.0%
Memory size619.6 KiB
Substantial
57049 
Destroyed
17322 
Minor
 
2512

Length

Max length11
Median length11
Mean length10.3533551
Min length5

Characters and Unicode

Total characters795997
Distinct characters16
Distinct categories2
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowSubstantial
2nd rowSubstantial
3rd rowSubstantial
4th rowSubstantial
5th rowDestroyed

Common Values

ValueCountFrequency (%)
Substantial57049
71.9%
Destroyed17322
 
21.8%
Minor2512
 
3.2%
(Missing)2410
 
3.0%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
substantial57049
74.2%
destroyed17322
 
22.5%
minor2512
 
3.3%

Most occurring characters

ValueCountFrequency (%)
t131420
16.5%
a114098
14.3%
s74371
9.3%
n59561
7.5%
i59561
7.5%
S57049
7.2%
u57049
7.2%
b57049
7.2%
l57049
7.2%
e34644
 
4.4%
Other values (6)94146
11.8%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter719114
90.3%
Uppercase Letter76883
 
9.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
t131420
18.3%
a114098
15.9%
s74371
10.3%
n59561
8.3%
i59561
8.3%
u57049
7.9%
b57049
7.9%
l57049
7.9%
e34644
 
4.8%
r19834
 
2.8%
Other values (3)54478
7.6%
Uppercase Letter
ValueCountFrequency (%)
S57049
74.2%
D17322
 
22.5%
M2512
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
Latin795997
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
t131420
16.5%
a114098
14.3%
s74371
9.3%
n59561
7.5%
i59561
7.5%
S57049
7.2%
u57049
7.2%
b57049
7.2%
l57049
7.2%
e34644
 
4.4%
Other values (6)94146
11.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII795997
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
t131420
16.5%
a114098
14.3%
s74371
9.3%
n59561
7.5%
i59561
7.5%
S57049
7.2%
u57049
7.2%
b57049
7.2%
l57049
7.2%
e34644
 
4.4%
Other values (6)94146
11.8%

Aircraft.Category
Categorical

HIGH CORRELATION
MISSING

Distinct13
Distinct (%)0.1%
Missing56816
Missing (%)71.7%
Memory size619.6 KiB
Airplane
19273 
Helicopter
2360 
Glider
 
381
Balloon
 
175
Gyrocraft
 
100
Other values (8)
 
188

Length

Max length17
Median length8
Mean length8.205543444
Min length5

Characters and Unicode

Total characters184436
Distinct characters31
Distinct categories4
Distinct scripts2
Distinct blocks1

Unique

Unique1
Unique (%)< 0.1%

Sample

1st rowAirplane
2nd rowAirplane
3rd rowAirplane
4th rowAirplane
5th rowAirplane

Common Values

ValueCountFrequency (%)
Airplane19273
 
24.3%
Helicopter2360
 
3.0%
Glider381
 
0.5%
Balloon175
 
0.2%
Gyrocraft100
 
0.1%
Weight-Shift66
 
0.1%
Powered Parachute48
 
0.1%
Unknown32
 
< 0.1%
Ultralight31
 
< 0.1%
Powered-Lift5
 
< 0.1%
Other values (3)6
 
< 0.1%
(Missing)56816
71.7%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
airplane19273
85.6%
helicopter2360
 
10.5%
glider381
 
1.7%
balloon175
 
0.8%
gyrocraft100
 
0.4%
weight-shift66
 
0.3%
parachute48
 
0.2%
powered48
 
0.2%
unknown32
 
0.1%
ultralight31
 
0.1%
Other values (4)11
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
e24597
13.3%
l22431
12.2%
r22348
12.1%
i22185
12.0%
p21638
11.7%
a19677
10.7%
n19546
10.6%
A19273
10.4%
o2898
 
1.6%
t2708
 
1.5%
Other values (21)7135
 
3.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter161721
87.7%
Uppercase Letter22596
 
12.3%
Dash Punctuation71
 
< 0.1%
Space Separator48
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e24597
15.2%
l22431
13.9%
r22348
13.8%
i22185
13.7%
p21638
13.4%
a19677
12.2%
n19546
12.1%
o2898
 
1.8%
t2708
 
1.7%
c2509
 
1.6%
Other values (9)1184
 
0.7%
Uppercase Letter
ValueCountFrequency (%)
A19273
85.3%
H2360
 
10.4%
G483
 
2.1%
B178
 
0.8%
P101
 
0.4%
W66
 
0.3%
S66
 
0.3%
U63
 
0.3%
L5
 
< 0.1%
R1
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
-71
100.0%
Space Separator
ValueCountFrequency (%)
48
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin184317
99.9%
Common119
 
0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e24597
13.3%
l22431
12.2%
r22348
12.1%
i22185
12.0%
p21638
11.7%
a19677
10.7%
n19546
10.6%
A19273
10.5%
o2898
 
1.6%
t2708
 
1.5%
Other values (19)7016
 
3.8%
Common
ValueCountFrequency (%)
-71
59.7%
48
40.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII184436
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e24597
13.3%
l22431
12.2%
r22348
12.1%
i22185
12.0%
p21638
11.7%
a19677
10.7%
n19546
10.6%
A19273
10.4%
o2898
 
1.6%
t2708
 
1.5%
Other values (21)7135
 
3.9%

Registration.Number
Categorical

HIGH CARDINALITY
MISSING

Distinct68960
Distinct (%)90.5%
Missing3084
Missing (%)3.9%
Memory size619.6 KiB
NONE
 
365
None
 
103
USAF
 
9
N20752
 
8
UNK
 
7
Other values (68955)
75717 

Length

Max length11
Median length6
Mean length5.827684394
Min length3

Characters and Unicode

Total characters444122
Distinct characters56
Distinct categories6
Distinct scripts2
Distinct blocks1

Unique

Unique62893
Unique (%)82.5%

Sample

1st rowN710XP
2nd rowN52388
3rd rowN5499Z
4th rowN918KS
5th rowN176PA

Common Values

ValueCountFrequency (%)
NONE365
 
0.5%
None103
 
0.1%
USAF9
 
< 0.1%
N207528
 
< 0.1%
UNK7
 
< 0.1%
N8402K6
 
< 0.1%
N538936
 
< 0.1%
N4101E6
 
< 0.1%
N121CC6
 
< 0.1%
N11VH6
 
< 0.1%
Other values (68950)75687
95.5%
(Missing)3084
 
3.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
none471
 
0.6%
unreg12
 
< 0.1%
usaf9
 
< 0.1%
n207528
 
< 0.1%
unk7
 
< 0.1%
n8402k6
 
< 0.1%
n121cc6
 
< 0.1%
n4101e6
 
< 0.1%
n11vh6
 
< 0.1%
n538936
 
< 0.1%
Other values (68942)75673
99.3%

Most occurring characters

ValueCountFrequency (%)
N78370
17.6%
131991
 
7.2%
230751
 
6.9%
329295
 
6.6%
529027
 
6.5%
428846
 
6.5%
627910
 
6.3%
727615
 
6.2%
926878
 
6.1%
826640
 
6.0%
Other values (46)106799
24.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number282165
63.5%
Uppercase Letter161355
36.3%
Lowercase Letter436
 
0.1%
Dash Punctuation161
 
< 0.1%
Other Punctuation4
 
< 0.1%
Space Separator1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N78370
48.6%
A5712
 
3.5%
C4926
 
3.1%
S4824
 
3.0%
D4281
 
2.7%
M4276
 
2.7%
B4253
 
2.6%
P4171
 
2.6%
R3972
 
2.5%
T3947
 
2.4%
Other values (16)42623
26.4%
Lowercase Letter
ValueCountFrequency (%)
n130
29.8%
e128
29.4%
o108
24.8%
r19
 
4.4%
g13
 
3.0%
t7
 
1.6%
i5
 
1.1%
s5
 
1.1%
u4
 
0.9%
k4
 
0.9%
Other values (6)13
 
3.0%
Decimal Number
ValueCountFrequency (%)
131991
11.3%
230751
10.9%
329295
10.4%
529027
10.3%
428846
10.2%
627910
9.9%
727615
9.8%
926878
9.5%
826640
9.4%
023212
8.2%
Other Punctuation
ValueCountFrequency (%)
*3
75.0%
.1
 
25.0%
Dash Punctuation
ValueCountFrequency (%)
-161
100.0%
Space Separator
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common282331
63.6%
Latin161791
36.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
N78370
48.4%
A5712
 
3.5%
C4926
 
3.0%
S4824
 
3.0%
D4281
 
2.6%
M4276
 
2.6%
B4253
 
2.6%
P4171
 
2.6%
R3972
 
2.5%
T3947
 
2.4%
Other values (32)43059
26.6%
Common
ValueCountFrequency (%)
131991
11.3%
230751
10.9%
329295
10.4%
529027
10.3%
428846
10.2%
627910
9.9%
727615
9.8%
926878
9.5%
826640
9.4%
023212
8.2%
Other values (4)166
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII444122
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N78370
17.6%
131991
 
7.2%
230751
 
6.9%
329295
 
6.6%
529027
 
6.5%
428846
 
6.5%
627910
 
6.3%
727615
 
6.2%
926878
 
6.1%
826640
 
6.0%
Other values (46)106799
24.0%

Make
Categorical

HIGH CARDINALITY

Distinct7475
Distinct (%)9.4%
Missing89
Missing (%)0.1%
Memory size619.6 KiB
CESSNA
17105 
PIPER
9433 
Cessna
7742 
Piper
4096 
BEECH
 
3133
Other values (7470)
37695 

Length

Max length33
Median length6
Mean length7.494432099
Min length2

Characters and Unicode

Total characters593589
Distinct characters73
Distinct categories9
Distinct scripts2
Distinct blocks1

Unique

Unique6080
Unique (%)7.7%

Sample

1st rowSOFTEX INVEST LLC
2nd rowCESSNA
3rd rowPIPER
4th rowMEAD
5th rowPETRUS DAVID WAYNE

Common Values

ValueCountFrequency (%)
CESSNA17105
21.6%
PIPER9433
 
11.9%
Cessna7742
 
9.8%
Piper4096
 
5.2%
BEECH3133
 
4.0%
Beech1748
 
2.2%
BELL1579
 
2.0%
BOEING1372
 
1.7%
GRUMMAN907
 
1.1%
Bell888
 
1.1%
Other values (7465)31201
39.3%
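
The Common Values table above lists the same manufacturer under different capitalisations (for example CESSNA and Cessna), which inflates the distinct count. The sketch below shows one way to merge such labels, assuming that upper-casing is an acceptable normalisation for this column. The counts are copied from the table; the helper name `fold_case` is hypothetical.

```python
from collections import Counter

# Top ten Make labels and their counts, copied from the Common Values table.
make_counts = {
    "CESSNA": 17105, "PIPER": 9433, "Cessna": 7742, "Piper": 4096,
    "BEECH": 3133, "Beech": 1748, "BELL": 1579, "BOEING": 1372,
    "GRUMMAN": 907, "Bell": 888,
}


def fold_case(counts):
    """Merge labels that differ only by letter case or surrounding whitespace."""
    merged = Counter()
    for label, count in counts.items():
        merged[label.strip().upper()] += count
    return merged


merged = fold_case(make_counts)
for label, count in merged.most_common():
    print(label, count)
```

On a full DataFrame the same idea would typically be applied to the column itself before profiling, so that the cardinality warning reflects real manufacturers rather than spelling variants.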

Length

Histogram of lengths of the category
ValueCountFrequency (%)
cessna24888
26.2%
piper13569
 
14.3%
beech4889
 
5.2%
bell2514
 
2.7%
boeing2231
 
2.4%
grumman1456
 
1.5%
robinson1305
 
1.4%
mooney1249
 
1.3%
bellanca989
 
1.0%
american956
 
1.0%
Other values (5893)40779
43.0%

Most occurring characters

ValueCountFrequency (%)
E54190
 
9.1%
S45822
 
7.7%
C40118
 
6.8%
A38648
 
6.5%
N32871
 
5.5%
e28798
 
4.9%
R28322
 
4.8%
P26795
 
4.5%
I23136
 
3.9%
s21055
 
3.5%
Other values (63)253834
42.8%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter396119
66.7%
Lowercase Letter177573
29.9%
Space Separator15621
 
2.6%
Other Punctuation2496
 
0.4%
Dash Punctuation989
 
0.2%
Open Punctuation338
 
0.1%
Close Punctuation336
 
0.1%
Decimal Number112
 
< 0.1%
Math Symbol5
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
E54190
13.7%
S45822
11.6%
C40118
10.1%
A38648
9.8%
N32871
8.3%
R28322
 
7.1%
P26795
 
6.8%
I23136
 
5.8%
O15673
 
4.0%
B15093
 
3.8%
Other values (16)75451
19.0%
Lowercase Letter
ValueCountFrequency (%)
e28798
16.2%
s21055
11.9%
a18143
10.2%
n17631
9.9%
r15963
9.0%
i12784
7.2%
o10843
 
6.1%
l8309
 
4.7%
c7218
 
4.1%
p5897
 
3.3%
Other values (16)30932
17.4%
Decimal Number
ValueCountFrequency (%)
120
17.9%
717
15.2%
017
15.2%
213
11.6%
510
8.9%
39
8.0%
69
8.0%
87
 
6.2%
46
 
5.4%
94
 
3.6%
Other Punctuation
ValueCountFrequency (%)
.1564
62.7%
,321
 
12.9%
/283
 
11.3%
&271
 
10.9%
'35
 
1.4%
?22
 
0.9%
Space Separator
ValueCountFrequency (%)
15621
100.0%
Dash Punctuation
ValueCountFrequency (%)
-989
100.0%
Open Punctuation
ValueCountFrequency (%)
(338
100.0%
Close Punctuation
ValueCountFrequency (%)
)336
100.0%
Math Symbol
ValueCountFrequency (%)
+5
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin573692
96.6%
Common19897
 
3.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
E54190
 
9.4%
S45822
 
8.0%
C40118
 
7.0%
A38648
 
6.7%
N32871
 
5.7%
e28798
 
5.0%
R28322
 
4.9%
P26795
 
4.7%
I23136
 
4.0%
s21055
 
3.7%
Other values (42)233937
40.8%
Common
ValueCountFrequency (%)
15621
78.5%
.1564
 
7.9%
-989
 
5.0%
(338
 
1.7%
)336
 
1.7%
,321
 
1.6%
/283
 
1.4%
&271
 
1.4%
'35
 
0.2%
?22
 
0.1%
Other values (11)117
 
0.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII593589
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
E54190
 
9.1%
S45822
 
7.7%
C40118
 
6.8%
A38648
 
6.5%
N32871
 
5.5%
e28798
 
4.9%
R28322
 
4.8%
P26795
 
4.5%
I23136
 
3.9%
s21055
 
3.5%
Other values (63)253834
42.8%

Model
Categorical

HIGH CARDINALITY

Distinct11330
Distinct (%)14.3%
Missing118
Missing (%)0.1%
Memory size619.6 KiB
152
 
2278
172
 
1263
172N
 
1133
PA-28-140
 
900
172M
 
775
Other values (11325)
72826 

Length

Max length20
Median length5
Mean length5.828215977
Min length1

Characters and Unicode

Total characters461449
Distinct characters82
Distinct categories10
Distinct scripts2
Distinct blocks1

Unique

Unique7068
Unique (%)8.9%

Sample

1st rowV-24L
2nd row182
3rd rowPA22
4th rowRV 8A
5th rowS90

Common Values

Value                 Count  Frequency (%)
152                   2278   2.9%
172                   1263   1.6%
172N                  1133   1.4%
PA-28-140             900    1.1%
172M                  775    1.0%
150                   725    0.9%
172P                  669    0.8%
150M                  581    0.7%
PA-18                 573    0.7%
PA-28-161             558    0.7%
Other values (11320)  69720  87.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
1522303
 
2.5%
1721329
 
1.5%
172n1135
 
1.2%
ii933
 
1.0%
pa-28-140901
 
1.0%
172m775
 
0.9%
150758
 
0.8%
172p672
 
0.7%
150m581
 
0.6%
pa-18577
 
0.6%
Other values (8845)80852
89.0%

Most occurring characters

ValueCountFrequency (%)
145859
 
9.9%
245323
 
9.8%
-43474
 
9.4%
033868
 
7.3%
A31738
 
6.9%
519580
 
4.2%
818527
 
4.0%
318109
 
3.9%
P17213
 
3.7%
717205
 
3.7%
Other values (72)170553
37.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number223981
48.5%
Uppercase Letter166764
36.1%
Dash Punctuation43474
 
9.4%
Lowercase Letter14723
 
3.2%
Space Separator11641
 
2.5%
Other Punctuation457
 
0.1%
Open Punctuation181
 
< 0.1%
Close Punctuation177
 
< 0.1%
Math Symbol50
 
< 0.1%
Control1
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A31738
19.0%
P17213
 
10.3%
R11139
 
6.7%
C10766
 
6.5%
B10571
 
6.3%
T8969
 
5.4%
S8853
 
5.3%
E7887
 
4.7%
I6869
 
4.1%
M6425
 
3.9%
Other values (16)46334
27.8%
Lowercase Letter
ValueCountFrequency (%)
a1806
12.3%
e1649
11.2%
r1497
10.2%
i1305
 
8.9%
t1184
 
8.0%
o1061
 
7.2%
n998
 
6.8%
l709
 
4.8%
s662
 
4.5%
c489
 
3.3%
Other values (16)3363
22.8%
Other Punctuation
ValueCountFrequency (%)
/252
55.1%
.150
32.8%
"18
 
3.9%
'17
 
3.7%
#6
 
1.3%
,6
 
1.3%
&4
 
0.9%
%1
 
0.2%
;1
 
0.2%
\1
 
0.2%
Decimal Number
ValueCountFrequency (%)
145859
20.5%
245323
20.2%
033868
15.1%
519580
8.7%
818527
8.3%
318109
 
8.1%
717205
 
7.7%
410970
 
4.9%
610949
 
4.9%
93591
 
1.6%
Open Punctuation
ValueCountFrequency (%)
(180
99.4%
[1
 
0.6%
Close Punctuation
ValueCountFrequency (%)
)176
99.4%
]1
 
0.6%
Math Symbol
ValueCountFrequency (%)
+48
96.0%
=2
 
4.0%
Dash Punctuation
ValueCountFrequency (%)
-43474
100.0%
Space Separator
ValueCountFrequency (%)
11641
100.0%
Control
ValueCountFrequency (%)
1
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common279962
60.7%
Latin181487
39.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
A31738
17.5%
P17213
 
9.5%
R11139
 
6.1%
C10766
 
5.9%
B10571
 
5.8%
T8969
 
4.9%
S8853
 
4.9%
E7887
 
4.3%
I6869
 
3.8%
M6425
 
3.5%
Other values (42)61057
33.6%
Common
ValueCountFrequency (%)
145859
16.4%
245323
16.2%
-43474
15.5%
033868
12.1%
519580
7.0%
818527
6.6%
318109
 
6.5%
717205
 
6.1%
11641
 
4.2%
410970
 
3.9%
Other values (20)15406
 
5.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII461449
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
145859
 
9.9%
245323
 
9.8%
-43474
 
9.4%
033868
 
7.3%
A31738
 
6.9%
519580
 
4.2%
818527
 
4.0%
318109
 
3.9%
P17213
 
3.7%
717205
 
3.7%
Other values (72)170553
37.0%
Distinct2
Distinct (%)< 0.1%
Missing572
Missing (%)0.7%
Memory size155.0 KiB
False
71105 
True
7616 
(Missing)
 
572
ValueCountFrequency (%)
False71105
89.7%
True7616
 
9.6%
(Missing)572
 
0.7%

Number.of.Engines
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct: 6
Distinct (%): < 0.1%
Missing: 4118
Missing (%): 5.2%
Infinite: 0
Infinite (%): 0.0%
Mean: 1.148054539
Minimum: 0
Maximum: 18
Zeros: 1143
Zeros (%): 1.4%
Negative: 0
Negative (%): 0.0%
Memory size: 619.6 KiB

Quantile statistics

Minimum: 0
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 2
Maximum: 18
Range: 18
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 0.4538474471
Coefficient of variation (CV): 0.3953187166
Kurtosis: 34.95018101
Mean: 1.148054539
Median Absolute Deviation (MAD): 0
Skewness: 3.091751648
Sum: 86305
Variance: 0.2059775052
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=6)
Value      Count  Frequency (%)
1          63082  79.6%
2          10057  12.7%
0          1143   1.4%
3          477    0.6%
4          415    0.5%
18         1      < 0.1%
(Missing)  4118   5.2%

Value  Count  Frequency (%)
0      1143   1.4%
1      63082  79.6%
2      10057  12.7%
3      477    0.6%
4      415    0.5%
18     1      < 0.1%

Value  Count  Frequency (%)
18     1      < 0.1%
4      415    0.5%
3      477    0.6%
2      10057  12.7%
1      63082  79.6%
0      1143   1.4%
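
The descriptive statistics for Number.of.Engines can be cross-checked against the frequency table for the same variable. The sketch below recomputes the sum, mean, variance, standard deviation and coefficient of variation from the six value counts. It assumes the report uses the sample variance (divisor n - 1), which is consistent with the figures shown; the helper name `summarize` and the hard-coded row total are illustrative.

```python
import math

# Value counts copied from the Number.of.Engines frequency table.
engine_counts = {0: 1143, 1: 63082, 2: 10057, 3: 477, 4: 415, 18: 1}
TOTAL_ROWS = 79293  # number of observations stated in the report overview


def summarize(counts):
    """Recompute basic descriptive statistics from a value -> count mapping."""
    n = sum(counts.values())
    total = sum(value * count for value, count in counts.items())
    mean = total / n
    # Sample variance (ddof = 1); an assumption that matches the reported value.
    variance = sum(c * (v - mean) ** 2 for v, c in counts.items()) / (n - 1)
    std = math.sqrt(variance)
    return {"n": n, "sum": total, "mean": mean,
            "variance": variance, "std": std, "cv": std / mean}


stats = summarize(engine_counts)
missing = TOTAL_ROWS - stats["n"]
for name, value in stats.items():
    print(name, value)
print("missing", missing)
```

The coefficient of variation is simply the standard deviation divided by the mean, and the number of missing cells is the row total minus the number of non-missing values.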

Engine.Type
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct14
Distinct (%)< 0.1%
Missing3374
Missing (%)4.3%
Memory size619.6 KiB
Reciprocating
64598 
Turbo Shaft
 
3305
Turbo Prop
 
3042
Turbo Fan
 
2226
Unknown
 
2052
Other values (9)
 
696

Length

Max length16
Median length13
Mean length12.47633662
Min length4

Characters and Unicode

Total characters947191
Distinct characters33
Distinct categories4
Distinct scripts2
Distinct blocks1

Unique

Unique4
Unique (%)< 0.1%

Sample

1st rowReciprocating
2nd rowReciprocating
3rd rowReciprocating
4th rowReciprocating
5th rowReciprocating

Common Values

ValueCountFrequency (%)
Reciprocating64598
81.5%
Turbo Shaft3305
 
4.2%
Turbo Prop3042
 
3.8%
Turbo Fan2226
 
2.8%
Unknown2052
 
2.6%
Turbo Jet678
 
0.9%
None6
 
< 0.1%
Electric3
 
< 0.1%
TF, TJ3
 
< 0.1%
REC, TJ, TJ2
 
< 0.1%
Other values (4)4
 
< 0.1%
(Missing)3374
 
4.3%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
reciprocating64598
75.8%
turbo9251
 
10.9%
shaft3305
 
3.9%
prop3042
 
3.6%
fan2226
 
2.6%
unknown2052
 
2.4%
jet678
 
0.8%
tj11
 
< 0.1%
rec7
 
< 0.1%
none6
 
< 0.1%
Other values (5)9
 
< 0.1%

Most occurring characters

ValueCountFrequency (%)
c129203
13.6%
i129200
13.6%
o78950
8.3%
r76895
8.1%
n72986
7.7%
a70129
7.4%
t68585
7.2%
p67640
7.1%
e65286
6.9%
R64606
6.8%
Other values (23)123711
13.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter852695
90.0%
Uppercase Letter85216
 
9.0%
Space Separator9266
 
1.0%
Other Punctuation14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
c129203
15.2%
i129200
15.2%
o78950
9.3%
r76895
9.0%
n72986
8.6%
a70129
8.2%
t68585
8.0%
p67640
7.9%
e65286
7.7%
g64598
7.6%
Other values (9)29223
 
3.4%
Uppercase Letter
ValueCountFrequency (%)
R64606
75.8%
T9265
 
10.9%
S3305
 
3.9%
P3042
 
3.6%
F2229
 
2.6%
U2052
 
2.4%
J689
 
0.8%
E12
 
< 0.1%
C8
 
< 0.1%
N6
 
< 0.1%
Other values (2)2
 
< 0.1%
Space Separator
ValueCountFrequency (%)
9266
100.0%
Other Punctuation
ValueCountFrequency (%)
,14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin937911
99.0%
Common9280
 
1.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
c129203
13.8%
i129200
13.8%
o78950
8.4%
r76895
8.2%
n72986
7.8%
a70129
7.5%
t68585
7.3%
p67640
7.2%
e65286
7.0%
R64606
6.9%
Other values (21)114431
12.2%
Common
ValueCountFrequency (%)
9266
99.8%
,14
 
0.2%

Most occurring blocks

ValueCountFrequency (%)
ASCII947191
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
c129203
13.6%
i129200
13.6%
o78950
8.3%
r76895
8.1%
n72986
7.7%
a70129
7.4%
t68585
7.2%
p67640
7.1%
e65286
6.9%
R64606
6.8%
Other values (23)123711
13.1%

FAR.Description
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct17
Distinct (%)0.1%
Missing56959
Missing (%)71.8%
Memory size619.6 KiB
Part 91: General Aviation
17958 
Part 137: Agricultural
 
1104
Non-U.S., Non-Commercial
 
771
Part 135: Air Taxi & Commuter
 
763
Part 121: Air Carrier
 
525
Other values (12)
 
1213

Length

Max length33
Median length25
Mean length24.4227635
Min length7

Characters and Unicode

Total characters545458
Distinct characters52
Distinct categories7
Distinct scripts2
Distinct blocks1

Unique

Unique2
Unique (%)< 0.1%

Sample

1st rowPart 91: General Aviation
2nd rowPart 91: General Aviation
3rd rowPart 91: General Aviation
4th rowPart 91: General Aviation
5th rowPart 91: General Aviation

Common Values

ValueCountFrequency (%)
Part 91: General Aviation17958
 
22.6%
Part 137: Agricultural1104
 
1.4%
Non-U.S., Non-Commercial771
 
1.0%
Part 135: Air Taxi & Commuter763
 
1.0%
Part 121: Air Carrier525
 
0.7%
Non-U.S., Commercial514
 
0.6%
Part 129: Foreign194
 
0.2%
Unknown181
 
0.2%
Public Use179
 
0.2%
Part 133: Rotorcraft Ext. Load96
 
0.1%
Other values (7)49
 
0.1%
(Missing)56959
71.8%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
part20670
24.0%
9117971
20.9%
aviation17958
20.8%
general17958
20.8%
air1288
 
1.5%
non-u.s1285
 
1.5%
agricultural1104
 
1.3%
1371104
 
1.3%
non-commercial771
 
0.9%
taxi763
 
0.9%
Other values (34)5293
 
6.1%

Most occurring characters

ValueCountFrequency (%)
63831
11.7%
a60524
11.1%
r46206
 
8.5%
i41302
 
7.6%
t40839
 
7.5%
e38879
 
7.1%
n38722
 
7.1%
o22746
 
4.2%
l21682
 
4.0%
121194
 
3.9%
Other values (42)149533
27.4%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter342025
62.7%
Uppercase Letter68060
 
12.5%
Space Separator63831
 
11.7%
Decimal Number44080
 
8.1%
Other Punctuation25392
 
4.7%
Dash Punctuation2056
 
0.4%
Math Symbol14
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a60524
17.7%
r46206
13.5%
i41302
12.1%
t40839
11.9%
e38879
11.4%
n38722
11.3%
o22746
 
6.7%
l21682
 
6.3%
v17958
 
5.3%
m4105
 
1.2%
Other values (12)9062
 
2.6%
Uppercase Letter
ValueCountFrequency (%)
P20868
30.7%
A20369
29.9%
G17958
26.4%
C2574
 
3.8%
N2056
 
3.0%
U1653
 
2.4%
S1300
 
1.9%
T763
 
1.1%
F217
 
0.3%
R96
 
0.1%
Other values (4)206
 
0.3%
Decimal Number
ValueCountFrequency (%)
121194
48.1%
918166
41.2%
32068
 
4.7%
71105
 
2.5%
5770
 
1.7%
2733
 
1.7%
036
 
0.1%
67
 
< 0.1%
41
 
< 0.1%
Other Punctuation
ValueCountFrequency (%)
:20670
81.4%
.2667
 
10.5%
,1292
 
5.1%
&763
 
3.0%
Space Separator
ValueCountFrequency (%)
63831
100.0%
Dash Punctuation
ValueCountFrequency (%)
-2056
100.0%
Math Symbol
ValueCountFrequency (%)
+14
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin410085
75.2%
Common135373
 
24.8%

Most frequent character per script

Latin
ValueCountFrequency (%)
a60524
14.8%
r46206
11.3%
i41302
10.1%
t40839
10.0%
e38879
9.5%
n38722
9.4%
o22746
 
5.5%
l21682
 
5.3%
P20868
 
5.1%
A20369
 
5.0%
Other values (26)57948
14.1%
Common
ValueCountFrequency (%)
63831
47.2%
121194
 
15.7%
:20670
 
15.3%
918166
 
13.4%
.2667
 
2.0%
32068
 
1.5%
-2056
 
1.5%
,1292
 
1.0%
71105
 
0.8%
5770
 
0.6%
Other values (6)1554
 
1.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII545458
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
63831
11.7%
a60524
11.1%
r46206
 
8.5%
i41302
 
7.6%
t40839
 
7.5%
e38879
 
7.1%
n38722
 
7.1%
o22746
 
4.2%
l21682
 
4.0%
121194
 
3.9%
Other values (42)149533
27.4%

Schedule
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct3
Distinct (%)< 0.1%
Missing67792
Missing (%)85.5%
Memory size619.6 KiB
UNK
4099 
NSCH
3866 
SCHD
3536 

Length

Max length4
Median length4
Mean length3.643596209
Min length3

Characters and Unicode

Total characters41905
Distinct characters7
Distinct categories1
Distinct scripts1
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowNSCH
2nd rowNSCH
3rd rowSCHD
4th rowSCHD
5th rowNSCH

Common Values

ValueCountFrequency (%)
UNK4099
 
5.2%
NSCH3866
 
4.9%
SCHD3536
 
4.5%
(Missing)67792
85.5%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
unk4099
35.6%
nsch3866
33.6%
schd3536
30.7%

Most occurring characters

ValueCountFrequency (%)
N7965
19.0%
S7402
17.7%
C7402
17.7%
H7402
17.7%
U4099
9.8%
K4099
9.8%
D3536
8.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter41905
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N7965
19.0%
S7402
17.7%
C7402
17.7%
H7402
17.7%
U4099
9.8%
K4099
9.8%
D3536
8.4%

Most occurring scripts

ValueCountFrequency (%)
Latin41905
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
N7965
19.0%
S7402
17.7%
C7402
17.7%
H7402
17.7%
U4099
9.8%
K4099
9.8%
D3536
8.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII41905
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N7965
19.0%
S7402
17.7%
C7402
17.7%
H7402
17.7%
U4099
9.8%
K4099
9.8%
D3536
8.4%

Purpose.of.Flight
Categorical

HIGH CORRELATION
HIGH CORRELATION
MISSING

Distinct22
Distinct (%)< 0.1%
Missing3894
Missing (%)4.9%
Memory size619.6 KiB
Personal
44550 
Instructional
9487 
Unknown
6771 
Aerial Application
 
4369
Business
 
3868
Other values (17)
6354 

Length

Max length25
Median length8
Mean length9.543468746
Min length5

Characters and Unicode

Total characters719568
Distinct characters41
Distinct categories5
Distinct scripts2
Distinct blocks1

Unique

Unique0
Unique (%)0.0%

Sample

1st rowPersonal
2nd rowPersonal
3rd rowPersonal
4th rowPersonal
5th rowPersonal

Common Values

ValueCountFrequency (%)
Personal44550
56.2%
Instructional9487
 
12.0%
Unknown6771
 
8.5%
Aerial Application4369
 
5.5%
Business3868
 
4.9%
Positioning1507
 
1.9%
Other Work Use1121
 
1.4%
Ferry775
 
1.0%
Public Aircraft707
 
0.9%
Aerial Observation673
 
0.8%
Other values (12)1571
 
2.0%
(Missing)3894
 
4.9%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
personal44550
52.6%
instructional9487
 
11.2%
unknown6771
 
8.0%
aerial5042
 
6.0%
application4369
 
5.2%
business3868
 
4.6%
positioning1507
 
1.8%
work1121
 
1.3%
use1121
 
1.3%
other1121
 
1.3%
Other values (21)5713
 
6.7%

Most occurring characters

ValueCountFrequency (%)
n96182
13.4%
o71434
9.9%
s69258
9.6%
r66872
9.3%
a66136
9.2%
l64940
9.0%
e59618
8.3%
P46964
6.5%
i35547
 
4.9%
t29431
 
4.1%
Other values (31)113186
15.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter624305
86.8%
Uppercase Letter85131
 
11.8%
Space Separator9271
 
1.3%
Other Punctuation661
 
0.1%
Dash Punctuation200
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
n96182
15.4%
o71434
11.4%
s69258
11.1%
r66872
10.7%
a66136
10.6%
l64940
10.4%
e59618
9.5%
i35547
 
5.7%
t29431
 
4.7%
c16386
 
2.6%
Other values (12)48501
7.8%
Uppercase Letter
ValueCountFrequency (%)
P46964
55.2%
A10475
 
12.3%
I9487
 
11.1%
U7892
 
9.3%
B3949
 
4.6%
O1794
 
2.1%
F1200
 
1.4%
W1121
 
1.3%
E598
 
0.7%
C515
 
0.6%
Other values (6)1136
 
1.3%
Space Separator
ValueCountFrequency (%)
9271
100.0%
Dash Punctuation
ValueCountFrequency (%)
-200
100.0%
Other Punctuation
ValueCountFrequency (%)
/661
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin709436
98.6%
Common10132
 
1.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
n96182
13.6%
o71434
10.1%
s69258
9.8%
r66872
9.4%
a66136
9.3%
l64940
9.2%
e59618
8.4%
P46964
6.6%
i35547
 
5.0%
t29431
 
4.1%
Other values (28)103054
14.5%
Common
ValueCountFrequency (%)
9271
91.5%
/661
 
6.5%
-200
 
2.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII719568
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
n96182
13.4%
o71434
9.9%
s69258
9.6%
r66872
9.3%
a66136
9.2%
l64940
9.0%
e59618
8.3%
P46964
6.5%
i35547
 
4.9%
t29431
 
4.1%
Other values (31)113186
15.7%

Air.Carrier
Categorical

HIGH CARDINALITY
MISSING

Distinct2866
Distinct (%)73.1%
Missing75375
Missing (%)95.1%
Memory size619.6 KiB
UNITED AIRLINES
 
49
AMERICAN AIRLINES
 
41
CONTINENTAL AIRLINES
 
25
DELTA AIR LINES INC
 
24
SOUTHWEST AIRLINES CO
 
24
Other values (2861)
3755 

Length

Max length90
Median length22
Mean length25.94793262
Min length3

Characters and Unicode

Total characters101664
Distinct characters73
Distinct categories8 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique2456 ?
Unique (%)62.7%

Sample

1st rowAerowest Aviation (DBA: Redtail Air)
2nd rowKey Lime Air (DBA: Key Lime Air)
3rd rowSKYWEST AIRLINES INC
4th rowGoJet Airlines, LLC. (DBA: Delta Connection)
5th rowFlight Development, LLC

Common Values

Value                        Count   Frequency (%)
UNITED AIRLINES                 49   0.1%
AMERICAN AIRLINES               41   0.1%
CONTINENTAL AIRLINES            25   < 0.1%
DELTA AIR LINES INC             24   < 0.1%
SOUTHWEST AIRLINES CO           24   < 0.1%
USAIR                           24   < 0.1%
AMERICAN AIRLINES, INC.         22   < 0.1%
CONTINENTAL AIRLINES, INC.      19   < 0.1%
AMERICAN AIRLINES INC           17   < 0.1%
UNITED AIR LINES INC            15   < 0.1%
Other values (2856)           3658   4.6%
(Missing)                    75375   95.1%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
inc1682
 
11.2%
dba1372
 
9.1%
air1256
 
8.4%
airlines1146
 
7.6%
aviation437
 
2.9%
service344
 
2.3%
express313
 
2.1%
american270
 
1.8%
airways239
 
1.6%
united187
 
1.2%
Other values (1699)7784
51.8%

Most occurring characters

ValueCountFrequency (%)
11112
 
10.9%
A9537
 
9.4%
I9107
 
9.0%
E6597
 
6.5%
R5875
 
5.8%
N5873
 
5.8%
S4759
 
4.7%
C4048
 
4.0%
L3497
 
3.4%
T3472
 
3.4%
Other values (63)37787
37.2%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter69324
68.2%
Lowercase Letter14302
 
14.1%
Space Separator11112
 
10.9%
Other Punctuation3974
 
3.9%
Open Punctuation1408
 
1.4%
Close Punctuation1407
 
1.4%
Decimal Number81
 
0.1%
Dash Punctuation56
 
0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A9537
13.8%
I9107
13.1%
E6597
9.5%
R5875
8.5%
N5873
8.5%
S4759
 
6.9%
C4048
 
5.8%
L3497
 
5.0%
T3472
 
5.0%
O2582
 
3.7%
Other values (16)13977
20.2%
Lowercase Letter
ValueCountFrequency (%)
i1946
13.6%
e1612
11.3%
r1568
11.0%
n1477
10.3%
a1220
8.5%
s1009
7.1%
t1001
7.0%
l840
 
5.9%
o709
 
5.0%
c695
 
4.9%
Other values (16)2225
15.6%
Decimal Number
ValueCountFrequency (%)
023
28.4%
213
16.0%
413
16.0%
612
14.8%
58
 
9.9%
15
 
6.2%
73
 
3.7%
82
 
2.5%
32
 
2.5%
Other Punctuation
ValueCountFrequency (%)
.1427
35.9%
:1366
34.4%
,1080
27.2%
'57
 
1.4%
&25
 
0.6%
/19
 
0.5%
Open Punctuation
ValueCountFrequency (%)
(1377
97.8%
[31
 
2.2%
Close Punctuation
ValueCountFrequency (%)
)1376
97.8%
]31
 
2.2%
Space Separator
ValueCountFrequency (%)
11112
100.0%
Dash Punctuation
ValueCountFrequency (%)
-56
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin83626
82.3%
Common18038
 
17.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
A9537
 
11.4%
I9107
 
10.9%
E6597
 
7.9%
R5875
 
7.0%
N5873
 
7.0%
S4759
 
5.7%
C4048
 
4.8%
L3497
 
4.2%
T3472
 
4.2%
O2582
 
3.1%
Other values (42)28279
33.8%
Common
ValueCountFrequency (%)
11112
61.6%
.1427
 
7.9%
(1377
 
7.6%
)1376
 
7.6%
:1366
 
7.6%
,1080
 
6.0%
'57
 
0.3%
-56
 
0.3%
[31
 
0.2%
]31
 
0.2%
Other values (11)125
 
0.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII101664
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
11112
 
10.9%
A9537
 
9.4%
I9107
 
9.0%
E6597
 
6.5%
R5875
 
5.8%
N5873
 
5.8%
S4759
 
4.7%
C4048
 
4.0%
L3497
 
3.4%
T3472
 
3.4%
Other values (63)37787
37.2%

Total.Fatal.Injuries
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct122
Distinct (%)0.2%
Missing23309
Missing (%)29.4%
Infinite0
Infinite (%)0.0%
Mean0.8146791941
Minimum0
Maximum349
Zeros40092
Zeros (%)50.6%
Negative0
Negative (%)0.0%
Memory size619.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile3
Maximum349
Range349
Interquartile range (IQR)1

Descriptive statistics

Standard deviation6.23370003
Coefficient of variation (CV)7.651723618
Kurtosis1082.130068
Mean0.8146791941
Median Absolute Deviation (MAD)0
Skewness29.51903889
Sum45609
Variance38.85901607
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
040092
50.6%
17847
 
9.9%
24619
 
5.8%
31451
 
1.8%
41012
 
1.3%
5311
 
0.4%
6196
 
0.2%
783
 
0.1%
865
 
0.1%
1042
 
0.1%
Other values (112)266
 
0.3%
(Missing)23309
29.4%
ValueCountFrequency (%)
040092
50.6%
17847
 
9.9%
24619
 
5.8%
31451
 
1.8%
41012
 
1.3%
5311
 
0.4%
6196
 
0.2%
783
 
0.1%
865
 
0.1%
936
 
< 0.1%
ValueCountFrequency (%)
3492
< 0.1%
2951
< 0.1%
2701
< 0.1%
2651
< 0.1%
2561
< 0.1%
2391
< 0.1%
2301
< 0.1%
2291
< 0.1%
2282
< 0.1%
2241
< 0.1%

Total.Serious.Injuries
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct40
Distinct (%)0.1%
Missing25551
Missing (%)32.2%
Infinite0
Infinite (%)0.0%
Mean0.3177031
Minimum0
Maximum111
Zeros42660
Zeros (%)53.8%
Negative0
Negative (%)0.0%
Memory size619.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile2
Maximum111
Range111
Interquartile range (IQR)0

Descriptive statistics

Standard deviation1.372924237
Coefficient of variation (CV)4.321406485
Kurtosis2213.530689
Mean0.3177031
Median Absolute Deviation (MAD)0
Skewness37.45361961
Sum17074
Variance1.884920959
MonotonicityNot monotonic
Histogram with fixed size bins (bins=40)
ValueCountFrequency (%)
042660
53.8%
18016
 
10.1%
22177
 
2.7%
3502
 
0.6%
4205
 
0.3%
565
 
0.1%
625
 
< 0.1%
722
 
< 0.1%
87
 
< 0.1%
97
 
< 0.1%
Other values (30)56
 
0.1%
(Missing)25551
32.2%
ValueCountFrequency (%)
042660
53.8%
18016
 
10.1%
22177
 
2.7%
3502
 
0.6%
4205
 
0.3%
565
 
0.1%
625
 
< 0.1%
722
 
< 0.1%
87
 
< 0.1%
97
 
< 0.1%
ValueCountFrequency (%)
1111
 
< 0.1%
1061
 
< 0.1%
811
 
< 0.1%
661
 
< 0.1%
601
 
< 0.1%
592
< 0.1%
551
 
< 0.1%
503
< 0.1%
471
 
< 0.1%
451
 
< 0.1%

Total.Minor.Injuries
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
SKEWED
ZEROS

Distinct62
Distinct (%)0.1%
Missing24460
Missing (%)30.8%
Infinite0
Infinite (%)0.0%
Mean0.5025805628
Minimum0
Maximum380
Zeros40064
Zeros (%)50.5%
Negative0
Negative (%)0.0%
Memory size619.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q31
95-th percentile2
Maximum380
Range380
Interquartile range (IQR)1

Descriptive statistics

Standard deviation2.781994444
Coefficient of variation (CV)5.53541989
Kurtosis7360.610457
Mean0.5025805628
Median Absolute Deviation (MAD)0
Skewness66.84890696
Sum27558
Variance7.739493085
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
040064
50.5%
19519
 
12.0%
23575
 
4.5%
3799
 
1.0%
4389
 
0.5%
5136
 
0.2%
672
 
0.1%
758
 
0.1%
928
 
< 0.1%
823
 
< 0.1%
Other values (52)170
 
0.2%
(Missing)24460
30.8%
ValueCountFrequency (%)
040064
50.5%
19519
 
12.0%
23575
 
4.5%
3799
 
1.0%
4389
 
0.5%
5136
 
0.2%
672
 
0.1%
758
 
0.1%
823
 
< 0.1%
928
 
< 0.1%
ValueCountFrequency (%)
3801
< 0.1%
2001
< 0.1%
1711
< 0.1%
1371
< 0.1%
1251
< 0.1%
961
< 0.1%
881
< 0.1%
841
< 0.1%
711
< 0.1%
691
< 0.1%

Total.Uninjured
Real number (ℝ≥0)

HIGH CORRELATION
MISSING
ZEROS

Distinct364
Distinct (%)0.5%
Missing12344
Missing (%)15.6%
Infinite0
Infinite (%)0.0%
Mean5.790885599
Minimum0
Maximum699
Zeros19126
Zeros (%)24.1%
Negative0
Negative (%)0.0%
Memory size619.6 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median1
Q32
95-th percentile6
Maximum699
Range699
Interquartile range (IQR)2

Descriptive statistics

Standard deviation29.22301628
Coefficient of variation (CV)5.046381211
Kurtosis101.5231507
Mean5.790885599
Median Absolute Deviation (MAD)1
Skewness8.950895422
Sum387694
Variance853.9846806
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
122583
28.5%
019126
24.1%
214316
18.1%
33949
 
5.0%
42508
 
3.2%
5817
 
1.0%
6449
 
0.6%
7253
 
0.3%
8141
 
0.2%
9111
 
0.1%
Other values (354)2696
 
3.4%
(Missing)12344
15.6%
ValueCountFrequency (%)
019126
24.1%
122583
28.5%
214316
18.1%
33949
 
5.0%
42508
 
3.2%
5817
 
1.0%
6449
 
0.6%
7253
 
0.3%
8141
 
0.2%
9111
 
0.1%
ValueCountFrequency (%)
6992
< 0.1%
5882
< 0.1%
5762
< 0.1%
5732
< 0.1%
5581
< 0.1%
5282
< 0.1%
5071
< 0.1%
5012
< 0.1%
4952
< 0.1%
4612
< 0.1%

Weather.Condition
Categorical

MISSING

Distinct3
Distinct (%)< 0.1%
Missing2157
Missing (%)2.7%
Memory size619.6 KiB
VMC
70507 
IMC
 
5660
UNK
 
969

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters231408
Distinct characters7
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowVMC
2nd rowVMC
3rd rowVMC
4th rowVMC
5th rowVMC

Common Values

Value       Count   Frequency (%)
VMC         70507   88.9%
IMC          5660    7.1%
UNK           969    1.2%
(Missing)    2157    2.7%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
vmc70507
91.4%
imc5660
 
7.3%
unk969
 
1.3%

Most occurring characters

ValueCountFrequency (%)
M76167
32.9%
C76167
32.9%
V70507
30.5%
I5660
 
2.4%
U969
 
0.4%
N969
 
0.4%
K969
 
0.4%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter231408
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M76167
32.9%
C76167
32.9%
V70507
30.5%
I5660
 
2.4%
U969
 
0.4%
N969
 
0.4%
K969
 
0.4%

Most occurring scripts

ValueCountFrequency (%)
Latin231408
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M76167
32.9%
C76167
32.9%
V70507
30.5%
I5660
 
2.4%
U969
 
0.4%
N969
 
0.4%
K969
 
0.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII231408
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M76167
32.9%
C76167
32.9%
V70507
30.5%
I5660
 
2.4%
U969
 
0.4%
N969
 
0.4%
K969
 
0.4%

Broad.Phase.of.Flight
Categorical

HIGH CORRELATION
MISSING

Distinct12
Distinct (%)< 0.1%
Missing6054
Missing (%)7.6%
Memory size619.6 KiB
LANDING
19209 
TAKEOFF
15284 
CRUISE
10749 
MANEUVERING
9818 
APPROACH
7720 
Other values (7)
10459 

Length

Max length11
Median length7
Mean length7.393779271
Min length4

Characters and Unicode

Total characters541513
Distinct characters23
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCRUISE
2nd rowLANDING
3rd rowTAKEOFF
4th rowTAKEOFF
5th rowDESCENT

Common Values

Value              Count   Frequency (%)
LANDING            19209   24.2%
TAKEOFF            15284   19.3%
CRUISE             10749   13.6%
MANEUVERING         9818   12.4%
APPROACH            7720    9.7%
TAXI                2322    2.9%
CLIMB               2279    2.9%
DESCENT             2202    2.8%
GO-AROUND           1608    2.0%
STANDING            1219    1.5%
Other values (2)     829    1.0%
(Missing)           6054    7.6%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
landing19209
26.2%
takeoff15284
20.9%
cruise10749
14.7%
maneuvering9818
13.4%
approach7720
10.5%
taxi2322
 
3.2%
climb2279
 
3.1%
descent2202
 
3.0%
go-around1608
 
2.2%
standing1219
 
1.7%
Other values (2)829
 
1.1%

Most occurring characters

ValueCountFrequency (%)
N66318
12.2%
A64900
12.0%
E50230
 
9.3%
I45596
 
8.4%
G31854
 
5.9%
F30568
 
5.6%
R30052
 
5.5%
O27049
 
5.0%
D24238
 
4.5%
C22950
 
4.2%
Other values (13)147758
27.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter539905
99.7%
Dash Punctuation1608
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
N66318
12.3%
A64900
12.0%
E50230
 
9.3%
I45596
 
8.4%
G31854
 
5.9%
F30568
 
5.7%
R30052
 
5.6%
O27049
 
5.0%
D24238
 
4.5%
C22950
 
4.3%
Other values (12)146150
27.1%
Dash Punctuation
ValueCountFrequency (%)
-1608
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin539905
99.7%
Common1608
 
0.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
N66318
12.3%
A64900
12.0%
E50230
 
9.3%
I45596
 
8.4%
G31854
 
5.9%
F30568
 
5.7%
R30052
 
5.6%
O27049
 
5.0%
D24238
 
4.5%
C22950
 
4.3%
Other values (12)146150
27.1%
Common
ValueCountFrequency (%)
-1608
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII541513
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
N66318
12.2%
A64900
12.0%
E50230
 
9.3%
I45596
 
8.4%
G31854
 
5.9%
F30568
 
5.6%
R30052
 
5.5%
O27049
 
5.0%
D24238
 
4.5%
C22950
 
4.2%
Other values (13)147758
27.3%

Report.Status
Categorical

HIGH CORRELATION

Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size619.6 KiB
Probable Cause
73923 
Foreign
 
3966
Preliminary
 
1090
Factual
 
314

Length

Max length14
Median length14
Mean length13.58092139
Min length7

Characters and Unicode

Total characters1076872
Distinct characters19
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowPreliminary
2nd rowPreliminary
3rd rowPreliminary
4th rowPreliminary
5th rowPreliminary

Common Values

Value            Count   Frequency (%)
Probable Cause   73923   93.2%
Foreign           3966    5.0%
Preliminary       1090    1.4%
Factual            314    0.4%

Length

Histogram of lengths of the category

Pie chart

ValueCountFrequency (%)
probable73923
48.2%
cause73923
48.2%
foreign3966
 
2.6%
preliminary1090
 
0.7%
factual314
 
0.2%

Most occurring characters

ValueCountFrequency (%)
e152902
14.2%
a149564
13.9%
b147846
13.7%
r80069
7.4%
o77889
7.2%
l75327
7.0%
P75013
7.0%
u74237
6.9%
73923
6.9%
C73923
6.9%
Other values (9)96179
8.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter849733
78.9%
Uppercase Letter153216
 
14.2%
Space Separator73923
 
6.9%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e152902
18.0%
a149564
17.6%
b147846
17.4%
r80069
9.4%
o77889
9.2%
l75327
8.9%
u74237
8.7%
s73923
8.7%
i6146
 
0.7%
n5056
 
0.6%
Other values (5)6774
 
0.8%
Uppercase Letter
ValueCountFrequency (%)
P75013
49.0%
C73923
48.2%
F4280
 
2.8%
Space Separator
ValueCountFrequency (%)
73923
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin1002949
93.1%
Common73923
 
6.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e152902
15.2%
a149564
14.9%
b147846
14.7%
r80069
8.0%
o77889
7.8%
l75327
7.5%
P75013
7.5%
u74237
7.4%
C73923
7.4%
s73923
7.4%
Other values (8)22256
 
2.2%
Common
ValueCountFrequency (%)
73923
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII1076872
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e152902
14.2%
a149564
13.9%
b147846
13.7%
r80069
7.4%
o77889
7.2%
l75327
7.0%
P75013
7.0%
u74237
6.9%
73923
6.9%
C73923
6.9%
Other values (9)96179
8.9%

Publication.Date
Categorical

HIGH CARDINALITY
MISSING

Distinct3591
Distinct (%)5.5%
Missing13474
Missing (%)17.0%
Memory size619.6 KiB
31/03/1993
 
452
25/11/2003
 
396
15/02/2001
 
376
13/09/2005
 
325
14/09/1993
 
312
Other values (3586)
63958 

Length

Max length10
Median length10
Mean length10
Min length10

Characters and Unicode

Total characters658190
Distinct characters11
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique1354 ?
Unique (%)2.1%

Sample

1st row05/01/2017
2nd row05/01/2017
3rd row03/01/2017
4th row29/12/2016
5th row05/01/2017

Common Values

Value                 Count   Frequency (%)
31/03/1993              452    0.6%
25/11/2003              396    0.5%
15/02/2001              376    0.5%
13/09/2005              325    0.4%
14/09/1993              312    0.4%
09/11/1992              282    0.4%
02/03/2001              277    0.3%
27/10/2005              276    0.3%
16/02/2001              269    0.3%
14/12/1992              266    0.3%
Other values (3581)   62588   78.9%
(Missing)             13474   17.0%

Length

Histogram of lengths of the category
ValueCountFrequency (%)
31/03/1993452
 
0.7%
25/11/2003396
 
0.6%
15/02/2001376
 
0.6%
13/09/2005325
 
0.5%
14/09/1993312
 
0.5%
09/11/1992282
 
0.4%
02/03/2001277
 
0.4%
27/10/2005276
 
0.4%
16/02/2001269
 
0.4%
14/12/1992266
 
0.4%
Other values (3581)62588
95.1%

Most occurring characters

ValueCountFrequency (%)
0142889
21.7%
/131638
20.0%
1101455
15.4%
281522
12.4%
971924
10.9%
332572
 
4.9%
827181
 
4.1%
519488
 
3.0%
618369
 
2.8%
417042
 
2.6%

Most occurring categories

ValueCountFrequency (%)
Decimal Number526552
80.0%
Other Punctuation131638
 
20.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0142889
27.1%
1101455
19.3%
281522
15.5%
971924
13.7%
332572
 
6.2%
827181
 
5.2%
519488
 
3.7%
618369
 
3.5%
417042
 
3.2%
714110
 
2.7%
Other Punctuation
ValueCountFrequency (%)
/131638
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common658190
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0142889
21.7%
/131638
20.0%
1101455
15.4%
281522
12.4%
971924
10.9%
332572
 
4.9%
827181
 
4.1%
519488
 
3.0%
618369
 
2.8%
417042
 
2.6%

Most occurring blocks

ValueCountFrequency (%)
ASCII658190
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0142889
21.7%
/131638
20.0%
1101455
15.4%
281522
12.4%
971924
10.9%
332572
 
4.9%
827181
 
4.1%
519488
 
3.0%
618369
 
2.8%
417042
 
2.6%

Interactions

[Figures: 49 pairwise interaction plots; images not preserved.]

Correlations


Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1, with -1 indicating total negative linear correlation, 0 indicating no linear correlation, and +1 indicating total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, so for an exactly linear relationship the steepness of the line (its angle to the x-axis) does not change the magnitude of r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
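The calculation described above can be sketched in a few lines of plain Python. This is a minimal illustration, not the report generator's own code; the function name `pearson_r` and the toy data are assumptions made for the example, and population (divide-by-n) moments are used, which gives the same ratio as sample moments.

```python
import math

def pearson_r(xs, ys):
    """Covariance of X and Y divided by the product of their standard deviations."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Population covariance and standard deviations (divide by n).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    if std_x == 0 or std_y == 0:
        raise ValueError("r is undefined when a variable is constant")
    return cov / (std_x * std_y)

# Toy data, not taken from the report.
x = [1, 2, 3, 4, 5]
y_up = [2 * v + 10 for v in x]     # exact increasing linear relation
y_down = [-3 * v + 1 for v in x]   # exact decreasing linear relation

r_up = pearson_r(x, y_up)      # close to +1
r_down = pearson_r(x, y_down)  # close to -1
```

Changing the slope or intercept of `y_up` leaves `r_up` at +1 (up to floating-point error), which is the location and scale invariance mentioned above.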

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1, with -1 indicating total negative monotonic correlation, 0 indicating no monotonic correlation, and +1 indicating total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
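As a rough sketch of that recipe, the snippet below ranks both variables and then applies the covariance-over-standard-deviations formula to the ranks. The helper names (`average_ranks`, `pearson_r`, `spearman_rho`) and the toy data are illustrative assumptions; ties receive the mean of their rank positions, which is a common convention but not something the report states.

```python
import math

def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks are 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks

def pearson_r(xs, ys):
    """Covariance divided by the product of the standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n
    std_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs) / n)
    std_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys) / n)
    return cov / (std_x * std_y)

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables of X and Y."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

# Toy data: monotonic but clearly nonlinear.
x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]

rho = spearman_rho(x, y)  # close to 1: the relation is perfectly monotonic
r = pearson_r(x, y)       # below 1: the relation is not linear
```

The gap between `rho` and `r` on this toy data shows why the rank-based coefficient is better suited to nonlinear monotonic relationships.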

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1, with -1 indicating total negative correlation, 0 indicating no correlation, and +1 indicating total positive correlation.

To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
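The pair-counting definition can be illustrated directly. The sketch below follows the formula as stated here (often called tau-a); the function name `kendall_tau_a` and the toy data are assumptions for illustration, and the tool that produced this report may apply a tie-adjusted variant instead.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length samples with at least 2 points")
    concordant = 0
    discordant = 0
    pairs = list(combinations(range(len(xs)), 2))
    for i, j in pairs:
        # Positive product: both variables move in the same direction.
        sign = (xs[i] - xs[j]) * (ys[i] - ys[j])
        if sign > 0:
            concordant += 1
        elif sign < 0:
            discordant += 1
        # Tied pairs (sign == 0) count as neither concordant nor discordant.
    return (concordant - discordant) / len(pairs)

# Toy data: one swapped pair out of six.
x = [1, 2, 3, 4]
y = [1, 3, 2, 4]
tau = kendall_tau_a(x, y)  # 5 concordant, 1 discordant, 6 pairs in total
```

This brute-force loop is quadratic in the number of observations, so it is only meant to make the definition concrete.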

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently across categorical, ordinal, and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient for a bivariate normal input distribution. Extensive documentation is available from the Phik project.

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
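For readers who want to see the correction spelled out, here is a minimal sketch of a bias-corrected Cramér's V computed from a contingency table of counts. It follows the commonly cited form of Bergsma's correction; the function name `cramers_v_corrected` and the toy tables are assumptions, and the report's own implementation may differ in detail.

```python
import math

def cramers_v_corrected(table):
    """Bias-corrected Cramér's V for a contingency table (list of rows of counts)."""
    n = sum(sum(row) for row in table)
    r, k = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(row[j] for row in table) for j in range(k)]

    # Pearson chi-squared statistic against the independence expectation.
    chi2 = 0.0
    for i in range(r):
        for j in range(k):
            expected = row_tot[i] * col_tot[j] / n
            if expected > 0:
                chi2 += (table[i][j] - expected) ** 2 / expected

    phi2 = chi2 / n
    # Corrected phi-squared and corrected table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(k_corr - 1, r_corr - 1)
    return math.sqrt(phi2_corr / denom) if denom > 0 else 0.0

# Toy tables, not taken from the report.
v_independent = cramers_v_corrected([[10, 20], [20, 40]])  # rows are proportional
v_perfect = cramers_v_corrected([[50, 0], [0, 50]])        # one category determines the other
```

On these toy tables the proportional rows give a value of 0 and the diagonal table gives a value of 1 (up to floating-point error), matching the two ends of the range described above.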

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion by eye.
The correlation heatmap measures nullity correlation: how strongly the presence or absence of one variable affects the presence of another.
The dendrogram allows you to more fully correlate variable completion, revealing trends deeper than the pairwise ones visible in the correlation heatmap.
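The nullity correlation shown in the heatmap can be approximated by hand. The sketch below assumes that nullity correlation means the Pearson correlation between per-column presence indicators (1 where a value is present, 0 where it is missing); that reading, the function name `nullity_correlation`, and the toy columns are assumptions, not something this report confirms.

```python
import math

def nullity_correlation(col_a, col_b):
    """Pearson correlation between presence indicators of two columns.

    Missing values are represented as None. Returns None when either
    column is always present or always missing (zero variance).
    """
    a = [0 if v is None else 1 for v in col_a]
    b = [0 if v is None else 1 for v in col_b]
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((p - mean_a) * (q - mean_b) for p, q in zip(a, b)) / n
    var_a = sum((p - mean_a) ** 2 for p in a) / n
    var_b = sum((q - mean_b) ** 2 for q in b) / n
    if var_a == 0 or var_b == 0:
        return None
    return cov / math.sqrt(var_a * var_b)

# Toy columns (illustrative values only): the first two are missing together.
latitude = [47.8, None, 37.7, None, 36.8]
longitude = [-122.7, None, -90.4, None, -119.8]
other = [None, "x", None, "y", None]  # present exactly where latitude is missing

together = nullity_correlation(latitude, longitude)  # close to +1
opposite = nullity_correlation(latitude, other)      # close to -1
```

A value near +1 means two columns tend to be filled in (or left empty) on the same rows, while a value near -1 means one tends to be present exactly when the other is missing.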

Sample

First rows

Event.IdInvestigation.TypeAccident.NumberEvent.DateLocationCountryLatitudeLongitudeAirport.CodeAirport.NameInjury.SeverityAircraft.DamageAircraft.CategoryRegistration.NumberMakeModelAmateur.BuiltNumber.of.EnginesEngine.TypeFAR.DescriptionSchedulePurpose.of.FlightAir.CarrierTotal.Fatal.InjuriesTotal.Serious.InjuriesTotal.Minor.InjuriesTotal.UninjuredWeather.ConditionBroad.Phase.of.FlightReport.StatusPublication.Date
020170103X43747AccidentWPR17LA0462017-01-03Paradise, MTUnited StatesNaNNaNNaNNaNNon-FatalSubstantialAirplaneN710XPSOFTEX INVEST LLCV-24LYesNaNReciprocatingPart 91: General AviationNaNPersonalNaNNaN2.0NaNNaNVMCCRUISEPreliminary05/01/2017
120161230X55950AccidentWPR17FA0442016-12-29Dabob, WAUnited States47.823611-122.790000NaNNaNFatal(4)SubstantialAirplaneN52388CESSNA182No1.0ReciprocatingPart 91: General AviationNaNPersonalNaN4.0NaNNaNNaNVMCNaNPreliminary05/01/2017
220161229X93022AccidentCEN17LA0622016-12-27Piedmont, MOUnited StatesNaNNaNNaNNaNNon-FatalSubstantialAirplaneN5499ZPIPERPA22No1.0ReciprocatingPart 91: General AviationNaNPersonalNaNNaNNaNNaN1.0VMCLANDINGPreliminary03/01/2017
320161227X80237AccidentCEN17LA0612016-12-27Farmington, MOUnited States37.761111-90.428611FAMFARMINGTON RGNLNon-FatalSubstantialAirplaneN918KSMEADRV 8AYesNaNNaNPart 91: General AviationNaNPersonalNaNNaNNaN1.01.0VMCTAKEOFFPreliminary29/12/2016
420161226X80840AccidentWPR17FA0412016-12-26Fresno, CAUnited States36.844444-119.870834E79Sierra Sky ParkFatal(2)DestroyedAirplaneN176PAPETRUS DAVID WAYNES90YesNaNReciprocatingPart 91: General AviationNaNPersonalNaN2.0NaNNaNNaNVMCTAKEOFFPreliminary05/01/2017
520161227X03229AccidentERA17FA0732016-12-26Gatlinburg, TNUnited States35.651944-83.458333GKTGATLINBURG-PIGEON FORGEFatal(3)DestroyedAirplaneN1839XCESSNA182No1.0ReciprocatingPart 91: General AviationNaNPersonalNaN3.0NaNNaNNaNIMCDESCENTPreliminary03/01/2017
620161223X22808AccidentERA17FA0722016-12-23Middlebury, VTUnited States43.981389-73.0944446B0MIDDLEBURY STATEFatal(1)SubstantialAirplaneN31202PIPERPA28No1.0ReciprocatingPart 91: General AviationNaNPersonalNaN1.0NaNNaNNaNVMCTAKEOFFPreliminary04/01/2017
720161221X11609IncidentENG17WA0072016-12-21Toronto, CanadaCanadaNaNNaNNaNNaNIncidentNaNAirplaneNaNBOEING767NoNaNNaNNaNNaNNaNNaNNaNNaNNaN224.0NaNTAKEOFFForeignNaN
820161222X21701AccidentCEN17LA0602016-12-21Millersburg, OHUnited States40.53666681.95583310GHolmes CountyNon-FatalSubstantialAirplaneN8381TCESSNA175CNo1.0ReciprocatingPart 91: General AviationNaNPersonalNaNNaNNaN1.01.0VMCAPPROACHPreliminary27/12/2016
920161220X20645AccidentCEN17LA0582016-12-18Blaine, MNUnited States45.195555-93.162778NaNNaNNon-FatalSubstantialAirplaneN4204BBELLANCA17 30ANo1.0ReciprocatingPart 91: General AviationNaNPersonalNaNNaNNaNNaN1.0VMCNaNPreliminary05/01/2017

Last rows

Event.IdInvestigation.TypeAccident.NumberEvent.DateLocationCountryLatitudeLongitudeAirport.CodeAirport.NameInjury.SeverityAircraft.DamageAircraft.CategoryRegistration.NumberMakeModelAmateur.BuiltNumber.of.EnginesEngine.TypeFAR.DescriptionSchedulePurpose.of.FlightAir.CarrierTotal.Fatal.InjuriesTotal.Serious.InjuriesTotal.Minor.InjuriesTotal.UninjuredWeather.ConditionBroad.Phase.of.FlightReport.StatusPublication.Date
7928320020909X01561AccidentNYC82DA0151982-01-01EAST HANOVER, NJUnited StatesNaNNaNN58HANOVERNon-FatalSubstantialAirplaneN7967QCESSNA401BNo2.0ReciprocatingPart 91: General AviationNaNBusinessNaN0.00.00.02.0IMCLANDINGProbable Cause01/01/1982
7928420020909X01560AccidentMIA82DA0291982-01-01JACKSONVILLE, FLUnited StatesNaNNaNJAXJACKSONVILLE INTLNon-FatalSubstantialNaNN3906KNORTH AMERICANNAVION L-17BNo1.0ReciprocatingNaNNaNPersonalNaN0.00.03.00.0IMCCRUISEProbable Cause01/01/1982
7928520020909X01559AccidentFTW82DA0341982-01-01HOBBS, NMUnited StatesNaNNaNNaNNaNNon-FatalSubstantialNaNN44832PIPERPA-28-161No1.0ReciprocatingNaNNaNPersonalNaN0.00.00.01.0VMCAPPROACHProbable Cause01/01/1982
7928620020909X01558AccidentATL82DKJ101982-01-01TUSKEGEE, ALUnited StatesNaNNaNNaNTUSKEGEENon-FatalSubstantialNaNN4275SBEECHV35BNo1.0ReciprocatingNaNNaNPersonalNaN0.00.00.01.0VMCLANDINGProbable Cause01/01/1982
7928720001218X45446AccidentCHI81LA1061981-08-01COTTON, MNUnited StatesNaNNaNNaNNaNFatal(4)DestroyedNaNN4988ECESSNA180No1.0ReciprocatingNaNNaNPersonalNaN4.00.00.00.0IMCUNKNOWNProbable Cause06/11/2001
7928820041105X01764AccidentCHI79FA0641979-08-02Canton, OHUnited StatesNaNNaNNaNNaNFatal(1)DestroyedNaNN15NYCessna501NoNaNNaNNaNNaNPersonalNaN1.02.0NaNNaNVMCAPPROACHProbable Cause16/04/1980
7928920001218X45448AccidentLAX96LA3211977-06-19EUREKA, CAUnited StatesNaNNaNNaNNaNFatal(2)DestroyedNaNN1168JRockwell112No1.0ReciprocatingNaNNaNPersonalNaN2.00.00.00.0IMCCRUISEProbable Cause12/09/2000
7929020061025X01555AccidentNYC07LA0051974-08-30Saltville, VAUnited States36.922223-81.878056NaNNaNFatal(3)DestroyedNaNN5142RCessna172MNo1.0ReciprocatingNaNNaNPersonalNaN3.0NaNNaNNaNIMCCRUISEProbable Cause26/02/2007
7929120001218X45447AccidentLAX94LA3361962-07-19BRIDGEPORT, CAUnited StatesNaNNaNNaNNaNFatal(4)DestroyedNaNN5069PPIPERPA24-180No1.0ReciprocatingNaNNaNPersonalNaN4.00.00.00.0UNKUNKNOWNProbable Cause19/09/1996
7929220001218X45444AccidentSEA87LA0801948-10-24MOOSE CREEK, IDUnited StatesNaNNaNNaNNaNFatal(2)DestroyedNaNNC6404STINSON108-3No1.0ReciprocatingNaNNaNPersonalNaN2.00.00.00.0UNKCRUISEProbable CauseNaN